
Test metrics

Sangchun Ha edited this page Jul 16, 2021 · 5 revisions

Metrics

We ran a test to check that the character error rate (CER) and word error rate (WER) are computed correctly.
The examples below cover LibriSpeech, AISHELL-1, and KsponSpeech, in that order.

LibriSpeech

Labels

unit id
< pad > 0
< eos > 1
< sos > 2
< blank > 3
(space) 4
A 5
D 6
E 7
H 8
I 9
O 10
R 11
S 12
T 13
W 14
Y 15
? 16
TARGET = "How is the weather today?"
PREDICTION = "How are the weather today?"

Character Error Rate


To calculate the character error rate, first remove all spaces.
The target becomes "Howistheweathertoday?", and the prediction becomes "Howaretheweathertoday?".

The distance between target and prediction is 3: turning "is" into "are" takes two substitutions (i -> a, s -> r) and one insertion (e).
The length of the target sequence is 21.

CER = distance / length = 3 / 21 = 0.14
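The calculation above can be reproduced with a short sketch. It uses a plain-Python edit distance as a stand-in for the Levenshtein package, so the names `edit_distance` and `character_error_rate` are illustrative and not taken from the project's code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance with unit cost for insert, delete, and substitute."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(target, prediction):
    # Remove spaces before comparing, as described above.
    target = target.replace(" ", "")
    prediction = prediction.replace(" ", "")
    return edit_distance(target, prediction) / len(target)


cer = character_error_rate("How is the weather today?", "How are the weather today?")
print(round(cer, 2))  # 0.14
```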

Word Error Rate


The Levenshtein package only accepts strings, so we split each sentence on spaces and map each distinct word to a single character.

For example,
"How is the weather today?" -> "How", "is", "the", "weather", "today?" -> "A", "B", "C", "D", "E"
"How are the weather today?" -> "How", "are", "the", "weather", "today?" -> "A", "F", "C", "D", "E"

The distance between target and prediction is 1: "is" is replaced by "are" (B -> F).
The length of the target sequence is 5.

WER = distance / length = 1 / 5 = 0.2
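A minimal sketch of the word-to-character mapping, assuming a plain-Python edit distance in place of the Levenshtein package. The helper names `encode_as_chars` and `word_error_rate` are hypothetical and chosen only for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance with unit cost for insert, delete, and substitute."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def encode_as_chars(target_words, prediction_words):
    """Give every distinct word its own character, starting from "A"."""
    vocab = {}

    def encode(words):
        # setdefault assigns the next free index the first time a word is seen.
        return "".join(chr(ord("A") + vocab.setdefault(w, len(vocab))) for w in words)

    return encode(target_words), encode(prediction_words)


def word_error_rate(target, prediction):
    target_words = target.split()
    prediction_words = prediction.split()
    target_code, prediction_code = encode_as_chars(target_words, prediction_words)
    return edit_distance(target_code, prediction_code) / len(target_words)


codes = encode_as_chars("How is the weather today?".split(),
                        "How are the weather today?".split())
print(codes)  # ('ABCDE', 'AFCDE')
wer = word_error_rate("How is the weather today?", "How are the weather today?")
print(wer)  # 0.2
```

Sharing one vocabulary between target and prediction is what makes identical words compare as equal characters.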

AISHELL-1

Labels

unit id
< pad > 0
< eos > 1
< sos > 2
< blank > 3
4
5
6
7
8
9
10
? 11
TARGET = "今天天气怎么样?"
PREDICTION = "今天天气怎样?"

Character Error Rate


To calculate the character error rate, first remove all spaces.
This example contains no spaces, so the target stays "今天天气怎么样?" and the prediction stays "今天天气怎样?".

The distance between target and prediction is 1: the character "么" is deleted.
The length of the target sequence is 8.

CER = distance / length = 1 / 8 = 0.125

Word Error Rate


The Levenshtein package only accepts strings, so we split each sentence on spaces and map each distinct word to a single character.

For example,
"今天天气怎么样?" -> "今天天气怎么样?" -> "A"
"今天天气怎样?" -> "今天天气怎样?" -> "B"

The distance between target and prediction is 1: with no spaces to split on, each sentence is a single word, and the two words differ.
The length of the target sequence is 1.

WER = distance / length = 1 / 1 = 1.0
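The following sketch shows why the WER is 1.0 here: splitting on whitespace returns the whole sentence as one word. The shortcut used for the distance only holds for this single-word case and is an assumption of the example, not a general method.

```python
target = "今天天气怎么样?"
prediction = "今天天气怎样?"

# No whitespace in either sentence, so split() yields one word each.
target_words = target.split()
prediction_words = prediction.split()
print(len(target_words), len(prediction_words))  # 1 1

# With exactly one word on each side, the word-level distance is
# 0 if the words match and 1 (a single substitution) otherwise.
distance = 0 if target_words == prediction_words else 1
wer = distance / len(target_words)
print(wer)  # 1.0
```

For unsegmented text like this, the CER is the more informative of the two metrics, since any single character error makes the WER jump to 1.0.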

KsponSpeech

Labels

unit id
< pad > 0
< eos > 1
< sos > 2
< blank > 3
4
5
6
7
8
9
10
11
? 12
TARGET = "오늘 날씨는 어때?"
PREDICTION = "오늘 날씨는?"

Character Error Rate


To calculate the character error rate, first remove all spaces.
The target becomes "오늘날씨는어때?", and the prediction becomes "오늘날씨는?".

The distance between target and prediction is 2: the characters "어" and "때" are deleted.
The length of the target sequence is 8.

CER = distance / length = 2 / 8 = 0.25
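The length of 8 counts each Hangul syllable as one character. The sketch below assumes the transcripts are stored in precomposed (NFC) form, which the page does not state. If the text were decomposed into jamo (NFD), the same sentence would be longer and the CER would change.

```python
import unicodedata

# Assumption: transcripts use precomposed syllables (NFC).
target = unicodedata.normalize("NFC", "오늘 날씨는 어때?").replace(" ", "")
print(len(target))  # 8

# Decomposing syllables into jamo yields more code points for the same text.
decomposed = unicodedata.normalize("NFD", target)
print(len(decomposed) > len(target))  # True
```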

Word Error Rate


The Levenshtein package only accepts strings, so we split each sentence on spaces and map each distinct word to a single character.

For example,
"오늘 날씨는 어때?" -> "오늘", "날씨는", "어때?" -> "A", "B", "C"
"오늘 날씨는?" -> "오늘", "날씨는?" -> "A", "D"

The distance between target and prediction is 2: "날씨는" is replaced by "날씨는?" (B -> D) and "어때?" is deleted (C).
The length of the target sequence is 3.

WER = distance / length = 2 / 3 = 0.67
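As a cross-check, the same distance can be computed directly on the word lists, without mapping words to characters. This is an alternative sketch, not the approach described on this page, and `word_edit_distance` is a hypothetical name.

```python
from functools import lru_cache


def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance computed over words instead of characters."""
    ref, hyp = tuple(ref_words), tuple(hyp_words)

    @lru_cache(maxsize=None)
    def dist(i, j):
        # Distance between the suffixes ref[i:] and hyp[j:].
        if i == len(ref):
            return len(hyp) - j
        if j == len(hyp):
            return len(ref) - i
        if ref[i] == hyp[j]:
            return dist(i + 1, j + 1)
        return 1 + min(dist(i + 1, j),      # delete ref[i]
                       dist(i, j + 1),      # insert hyp[j]
                       dist(i + 1, j + 1))  # substitute ref[i] with hyp[j]

    return dist(0, 0)


target_words = "오늘 날씨는 어때?".split()
prediction_words = "오늘 날씨는?".split()
distance = word_edit_distance(target_words, prediction_words)
wer = distance / len(target_words)
print(distance, round(wer, 2))  # 2 0.67
```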
