Editing Guidelines v.1.12: Transcribio
(Last updated: Jan 21, 2022)
transcribio®
Table of contents
Speaker anonymization
Name fields
General
Keyboard shortcuts
Typing/speed
Web search
Grammar
Passages
[confidential]
[inaudible]
[crosstalk]
[unsupported language]
Numbers
Profanity
Punctuation
VIP transcripts
Spelling
Annexes
Example corrections
Texicon corrections
Speaker anonymization
The names of the speakers participating in the call should always be anonymized when
mentioned in text, according to their type:
● [Company] - for the interviewer company speaker name and interviewer company name
mentions. For multiple company speakers use the same notation [Company], without
numbers. In the name field interviewer speakers are identified as “You”.
● [Expert] - for the expert name mentions.
● [proSapient] - for the proSapient contact person name mentions. (proSapient is the call
vendor company and may be mentioned as well.)
Not anonymized:
Anonymized:
All other name mentions should be accurate and not anonymized, e.g. the expert’s current and
previous companies, companies of interest/competitors and their employees, public figures, the
expert’s colleagues etc. (with the exception of experts the interviewer company speakers have
talked to before, if mentioned).
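The tagging rule above can be illustrated with a small Python sketch. It is only an illustration of the rule, not a tool used in the editor: the function name `anonymize` and all names in the example are hypothetical, and the sketch assumes you already know which names belong to which speaker type.

```python
import re

def anonymize(text, company_names, expert_names, vendor_contact_names):
    """Replace known participant names with their bracketed speaker type."""
    tagged = [
        ("[Company]", company_names),            # interviewer speakers and their company
        ("[Expert]", expert_names),              # the expert
        ("[proSapient]", vendor_contact_names),  # the proSapient contact person
    ]
    for tag, names in tagged:
        # Longest names first, so a full name is replaced before a first name.
        for name in sorted(names, key=len, reverse=True):
            text = re.sub(r"\b" + re.escape(name) + r"\b", tag, text)
    return text

# Hypothetical names, for illustration only.
line = "thanks Jane, this is Mark from Acme, we also spoke to Siemens"
print(anonymize(line, ["Mark", "Acme"], ["Jane"], ["Olga"]))
# thanks [Expert], this is [Company] from [Company], we also spoke to Siemens
```

Names that are not in any of the three lists, such as competitors or public figures, are left untouched, which matches the rule that all other mentions stay accurate.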
Not anonymized:
Anonymized:
Name fields
The names of the speakers are not always autofilled in the name fields. If you see Unknown_#
instead of “Expert” or “You” (the interviewer), always select the appropriate speaker by clicking
on the name field.
The ability to select another speaker in the name field is only present where you see
Unknown_# instead of “Expert” or “You”.
Always select the same speaker for the same Unknown_# in different name fields, i.e. if you
select “You” for Unknown_1 in one name field, only select “You” in the rest of the name fields
with Unknown_1.
General
Transcripts should be accurate and readable. While the accuracy point is self-descriptive,
readability means removing word-for-word repetitions and frequent false starts, as well as filler
words if speakers overuse them, e.g.
I was wondering I was actually wondering if you could → I was actually wondering if you could
very good yes indeed very good, thanks thank you thank you very much indeed → very good
yes thank you very much indeed
here’s the, what I’m trying, we never, or let me put it this way, we considered a few options → let
me put it this way, we considered a few options
I mean, we have kind of, you know, right? kind of considered you know, that option, right? → I
mean, we have kind of considered that option, right?
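The simplest case above, an immediate word-for-word repetition, can be sketched in Python. This is a rough illustration under stated assumptions: the function name `collapse_repetitions` is hypothetical, only back-to-back repeats of one to four words are handled, and contractions, false starts and filler words still need a human decision.

```python
import re

# A phrase of one to four words that is immediately repeated one or more times,
# optionally separated by a comma.
REPEAT = re.compile(r"\b(\w+(?:\s+\w+){0,3})(?:,?\s+\1\b)+")

def collapse_repetitions(text):
    """Keep a single copy of any phrase that is repeated back to back."""
    return REPEAT.sub(r"\1", text)

print(collapse_repetitions("thank you thank you very much indeed"))
# thank you very much indeed
```

A legitimate doubled word (for example "had had") would also be collapsed, so the result always needs to be read against the audio.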
Do not remove filler words if they help preserve the meaning of the sentence, or add them if
they’re missing, e.g.
here’s what when I say best-in-class → here’s what I mean when I say best-in-class
do of any other competitor except Abc Company → do you know of any other competitor except
Abc Company
The dialogue should be accurately structured and make sense. If it doesn’t make sense, it’s not
accurate.
Do not add listening responses/affirmations (like “uh-huh,” “hmm,” “yeah” etc.) if they are not
automatically transcribed, unless they answer a specific question, e.g.
Only send a transcript to review (mark it as done) when it is ready to publish, i.e. do not make
any comments or use any markings not covered in these guidelines.
Keyboard shortcuts
Ctrl + Space: Play/stop audio
Typing/speed
Transcripts should be delivered as fast as possible. If you do not yet touch type, you can
practice on any speed typing website, for instance https://www.speedtypingonline.com/.
Web search
Every call will mention company names, brands, people, acronyms, terminology, locations etc.
that you are not familiar with. These should be accurate and spelled correctly. To find the
correct spelling, use web search.
The fastest way to find information is by using the variations of what you hear + speech context,
e.g.
In some cases, just the context will give you the result, e.g.
To find employees of specific companies, add their job title (and “LinkedIn”) to a search string, or
look up the company website for top management positions.
To find something that belongs to a list of things, like airports in London, islands in Denmark,
supermarkets in Germany, telcos in South America, use Wikipedia.
To find brands/products of a specific company, look up the company’s website.
To aid search for a specific country-based company, use its legal form: AG/GmbH for Germany,
SARL for France, Srl/SpA for Italy, SL for Spain, AB for Sweden, Ltd. for UK etc., or the
country’s top-level web domain, e.g. de, fr, it, nl (Netherlands), es, ch (Switzerland), se.
To find geographic locations, map browsing is sometimes the best option, e.g.
Tip: Google may show results based on your past search history, which is not ideal for finding
exact information across a broad range of topics. As an alternative to Google, try using
https://duckduckgo.com/.
Grammar
Sentences should be grammatically correct. For non-native speakers who make grammatical
mistakes, make the appropriate corrections so that the text is readable, e.g.
You was asking question on these a topic and I going to say → You were asking a question on
this topic and I was going to say
we make switch of start of 2020 a one years ago → we made the switch at the start of 2020 one
year ago
At the same time, do not rephrase or restructure sentences; do not correct grammar for native
speakers.
Passages
Every change of speaker type between company interviewer (“You”) and Expert should be
represented with a new passage. In other cases for regular tasks, keep the autotranscribed
passage structure [1]. If the system did not recognize that the type of speaker has changed/did
not transcribe their response, add the missing passage(s) or split the existing passage into
several, e.g.
To add a new passage, click on the + button to the left of the name field and select the speaker.
Alternatively, you can add missing responses (regardless of length) inside the existing passage,
identifying the speaker change on a new line, with the speaker followed by a colon and without
square brackets. Do not identify the first speaker in a passage (which is already identified in
the name field), e.g.
[1] For VIP tasks, always merge uninterrupted speech of a single speaker represented as several
consecutive passages into a single passage. For regular tasks, always merge uninterrupted speech
of a single speaker represented as several consecutive passages - where some of them are
identified as the wrong speaker - into a single passage.
If the system created a passage where there’s nobody speaking or if you’ve mistakenly created
a new passage for the wrong speaker, deactivate the passage with the delete button. You can
reactivate the deactivated passage by clicking on the delete button again.
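The merging rule for VIP tasks described in this section (uninterrupted speech of one speaker, split across consecutive passages, becomes a single passage) can be sketched in Python. The `Passage` type and the function name `merge_consecutive` are hypothetical and only model the rule; the actual merging is done by hand in the editor.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    speaker: str  # "You" or "Expert"
    text: str

def merge_consecutive(passages):
    """Merge neighbouring passages that belong to the same speaker."""
    merged = []
    for p in passages:
        if merged and merged[-1].speaker == p.speaker:
            # Same speaker as the previous passage: append the text to it.
            merged[-1] = Passage(p.speaker, merged[-1].text + " " + p.text)
        else:
            merged.append(Passage(p.speaker, p.text))
    return merged

parts = [
    Passage("Expert", "we considered a few options"),
    Passage("Expert", "and picked the cheapest one"),
    Passage("You", "why was that"),
]
for p in merge_consecutive(parts):
    print(p.speaker + ":", p.text)
# Expert: we considered a few options and picked the cheapest one
# You: why was that
```

For regular tasks the same merge applies only when some of the consecutive passages were attributed to the wrong speaker, so the speaker labels have to be corrected first.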
[confidential]
Sharing personal information - emails, phone numbers, addresses - is strictly prohibited. If
speakers are sharing personal information, replace it with [confidential], e.g.
I can give you my email it’s [email protected] spelled as my name J-O-E B-L-O-G-G-S
@email.com → I can give you my email it’s [confidential]
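As a rough aid for spotting one kind of personal information, the Python sketch below replaces literal email addresses with [confidential]. The pattern and the function name `redact_emails` are assumptions made for illustration; spelled-out addresses (as in the example above), phone numbers and postal addresses are not detected and still have to be replaced by hand.

```python
import re

# Deliberately simple pattern for a literal address such as name@example.com.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def redact_emails(text):
    """Replace every literal email address with [confidential]."""
    return EMAIL.sub("[confidential]", text)

# Hypothetical address, for illustration only.
print(redact_emails("I can give you my email it's jane.doe@example.com"))
# I can give you my email it's [confidential]
```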
[inaudible]
For places you cannot understand, use [inaudible]. Avoid using several [inaudible]’s in a row,
e.g.
For clearly audible acronyms, company names or general text that you cannot find or are not sure
about, put what you hear inside [inaudible: ...], e.g.
[inaudible: Stompf ODGYN]
we IPOed right after the [inaudible: ‘08 recession hit] which was good timing indeed
[crosstalk]
For places you cannot hear because of speakers talking over each other, use [crosstalk]. Do not
put [inaudible] and [crosstalk] together: use only [inaudible] in such cases. Do not end one
passage with [crosstalk] and start the next passage with [crosstalk]: only use one [crosstalk] in
such cases, e.g.
Do not use [crosstalk] for false starts or for listening responses that are not meaningful to the
conversation.
Do not capitalize “crosstalk” in square brackets regardless of its position in a sentence.
[unsupported language]
For specific words and phrases not spoken in English, try to find them using web search +
speech context and/or Google Translate, matching what you find to what you hear, e.g.
with a [unsupported language] how do you say it yes laser engraving → with a Laser-Gravur
how do you say it yes laser engraving
https://translate.google.com/?hl=en#view=home&op=translate&sl=en&tl=de&text=laser%20engraving
the [unsupported language] in Germany or the green point, in France this is [unsupported
language] → the grüner Punkt in Germany or the green point, in France this is point vert
https://translate.google.com/?hl=en#view=home&op=translate&sl=en&tl=de&text=green%20point
https://translate.google.com/?hl=en#view=home&op=translate&sl=en&tl=fr&text=green%20point
in Germany with the new packaging law [unsupported language] → in Germany with the new
packaging law Verpackungsgesetz
https://translate.google.com/?hl=en#view=home&op=translate&sl=en&tl=de&text=packaging%20law
https://duckduckgo.com/?q=Verpackungsrecht&ia=web
In case you cannot find them or when speakers are talking in a language other than English,
use [unsupported language].
In case speakers are talking in a language other than English for up to 4 passages, put
[unsupported language] in each passage, e.g.
For 5 or more passages, put [unsupported language] on a new line at the point where speakers
switch to non-English and deactivate all further passages until they switch back to English, e.g.
If you get a task in a language other than English, or speakers decide to continue the
conversation in any language other than English at any point in the first part of the call after
saying hello, reject it due to “Unsupported language.”
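The passage-count rule in this section (up to 4 non-English passages versus 5 or more) can be sketched in Python. This is a simplified model, assuming each passage is either fully English or fully non-English; the function name `plan_passages` and the action labels are hypothetical.

```python
from itertools import groupby

def plan_passages(is_english):
    """Return one action per passage: 'keep', 'tag' or 'deactivate'.

    'tag' means the passage gets [unsupported language].
    """
    actions = []
    for english, run in groupby(is_english):
        n = len(list(run))
        if english:
            actions += ["keep"] * n
        elif n <= 4:
            # Up to 4 non-English passages: tag each one.
            actions += ["tag"] * n
        else:
            # 5 or more: tag at the switch point, deactivate the rest of the run.
            actions += ["tag"] + ["deactivate"] * (n - 1)
    return actions

print(plan_passages([True, False, False, True]))
# ['keep', 'tag', 'tag', 'keep']
print(plan_passages([True] + [False] * 5 + [True]))
# ['keep', 'tag', 'deactivate', 'deactivate', 'deactivate', 'deactivate', 'keep']
```

The sketch does not cover the rejection case, where the task itself is not in English or speakers continue in another language from the first part of the call.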
Numbers
For number format, use the following table (spoken → written):
k (thousands):
one k → 1K
hundred K → 100K
6 ks (plural) → 6 k’s
eleven twelve percent → 11-12%
Profanity
If speakers are cursing, transcribe as is, without censoring.
Punctuation
For regular tasks, ignore punctuation after the first five lines in the first part of the transcript to
save time, i.e. do not edit punctuation that was autotranscribed and do not add any missing
punctuation; do not capitalize the first word in a sentence, e.g.
Regular:
so I’m [Expert] based, Here in. Liverpool UK my. experience is mostly Asia Pac and I also, have
some Knowledge on Australian and, New Zealand IFAs?
VIP:
So I’m [Expert], based here in Liverpool, UK. My experience is mostly Asia Pac, and I also have
some knowledge on Australian and New Zealand IFAs.
Do not use the exclamation mark unless it belongs to a name, like Yahoo!
VIP transcripts
For VIP transcripts, the whole text should be correctly structured into sentences and fully
punctuated.
If a sentence begins with “So”, do not put a comma after it, e.g.
So, why don’t we kick off with a brief introduction on your part? → So why don’t we kick off with
a brief introduction on your part?
Avoid constantly starting sentences with “And” and “So” if speakers overuse them.
For direct speech, use double quotes and a comma after the introductory clause, e.g.
I said, “Okay, let’s try G Suite” - we weren’t using it at the time - “and see if we can reduce our
cost by more than 15%.”
They went, “Listen, we cannot guarantee 100% uptime.” “Oh, then we’ll go with your competitor
who can, unless you can reduce the cost by 15%, annually.” “Fine, you got yourself a deal.”
For an unfinished/interrupted thought, use an ellipsis (three periods), without a space or any
adjacent punctuation, e.g.
When possible, keep sentences shorter rather than longer. Avoid using () parentheses.
Spelling
Use American spelling for all speakers. If the autotranscribed text has British spelling, use
British.
Be consistent in spelling/capitalizing company names: if you find several variants (e.g. IQVIA vs
Iqvia, Revtrac vs Rev-Trac etc.), choose the one dominating in search results and stick to it.
For acronyms, make sure you know what the acronym stands for, otherwise you will get it
wrong, e.g.
When speakers are spelling something out, use caps and dashes, e.g. J-O-E B-L-O-G-G-S.
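The caps-and-dashes format can be shown with a one-line Python helper. The function name `spell_out` is hypothetical and the sketch only demonstrates the target format.

```python
def spell_out(word):
    """Format a spelled-out word as capital letters joined by dashes."""
    return "-".join(ch.upper() for ch in word if ch.isalnum())

print(spell_out("Bloggs"))
# B-L-O-G-G-S
```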
Make sure to use a spell checker for typos etc. To enable one in Google Chrome, go to Settings
→ Advanced → Languages → Spell check.
Annexes
Example corrections
PwC strategy and → PwC Strategy&
spot of the, Material Handling → as part of Toyota Material Handling
youth. You see, you, been involved → you've been involved
my way of my way of background → by way of background
Software oppression → Software solution
700 and please → 700 employees
local prison → local presence
SAP success factors → SAP SuccessFactors
Fortune 500,000 companies → Fortune 500, 1000 companies
he RPO that business → he IPOed that business
will be big joined → will be joined
dot your eyes and cross your teeth → dot your i's and cross your t's
FCN support → SEN support || here: special educational needs
EPL or above the line → ATL or above the line
but I wanted to ask this question on the list → but I wanted to ask this question nonetheless
it's an excessive 5% → it's in excess of 5%
food safety modernization act → Food Safety Modernization Act
we wanted to a sir pain A how big is the market and B whore the major competitors → we
wanted to ascertain (a) how big is the market and (b) who are the major competitors
from subprime E to supplier B → from supplier A to supplier B
middle east → Middle East
per employee per month, that's your pepper → per employee per month, that's your PEPM
Texicon corrections
While the first option can be used in all cases, the second is meant only for unique corrections
that would always be made the same way manually whenever a particular word or word
combination is produced by the automated speech recognition engine.
Examples of texicons:
Ana plan → Anaplan
Mac books → MacBooks
era P → ERP
us foods → US Foods
far east → Far East
we need, we need → we need | repetition
CAP gaming night → Capgemini
cannot cannot → cannot | repetition
Please make sure any word pair you replace with All texts is unique as it will affect all further
transcripts.
If what you put on the left side can be used in a written sentence, it is not a texicon and
should not be replaced with All texts.
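To show why a texicon entry must be unique, the Python sketch below applies a few of the example pairs above as whole-phrase, case-sensitive replacements. It is only a model of the idea: the names `TEXICON` and `apply_texicon` are hypothetical, and how the platform itself applies All texts replacements is not described in these guidelines.

```python
import re

# A few of the example pairs listed above.
TEXICON = {
    "Ana plan": "Anaplan",
    "Mac books": "MacBooks",
    "era P": "ERP",
    "us foods": "US Foods",
    "CAP gaming night": "Capgemini",
    "cannot cannot": "cannot",
}

def apply_texicon(text, texicon=TEXICON):
    """Replace every whole-phrase occurrence of a texicon entry."""
    # Longest entries first, so a longer phrase wins over a shorter one.
    for wrong in sorted(texicon, key=len, reverse=True):
        pattern = r"(?<!\w)" + re.escape(wrong) + r"(?!\w)"
        text = re.sub(pattern, texicon[wrong], text)
    return text

print(apply_texicon("we moved from era P to Ana plan"))
# we moved from ERP to Anaplan
```

Because every match in every later transcript is rewritten, an entry whose left side could legitimately appear in a written sentence would silently corrupt correct text.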