I used AI code to translate any Chinese-subbed video

Member
Mar 2022
26
16
54
Hi, my name is Isaac, and some months ago I started working on code to translate Chinese videos. It was so difficult to understand these fantastic videos, where CNT players explain things that I have never seen or been taught. I decided to translate them so everyone can learn about Chinese technique. I created a YouTube channel but was insta-banned because Yin Hang, a former CNT member, bombed me with copyright claims. The channel was closed within 3 days, so I started uploading the videos to Dailymotion:


The problem is that with Dailymotion I can't reach as many users, as it's not like YouTube. So after I found another source for videos, I decided to start a new YouTube channel. Let's hope I don't get report-bombed.


The purpose of this channel is to bring table tennis content to everyone, without language barriers. Feel free to suggest videos for me to translate! If a video has some sort of copyright restriction, don't worry: I'll upload it to Dailymotion, where I can block users by location so that only viewers outside China can watch it.

Please, feel free to suggest new content or ideas!
 
Member
Mar 2022
26
16
54
Great idea. I would love to get more knowledge from the Chinese.
The algorithm seems to require some more work:
View attachment 26132
Some sentences don't make sense, and many have parts that don't either.
Anyway, keep working, I'd love to see more of that.
Thanks!
Thanks BaRanchik. I'm aware that the translations are sometimes not accurate, either because the AI image recognition goes crazy or because the AI picks up other text, for example when a brand printed on the table gets into the translation zone. I don't know Chinese, but I think this is better than nothing. From what I have tested, about 80% of the translations are accurate, and they give some context. I'm also working on better detection using image filters, but they are video-dependent, so I have to tune them for each video.
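
One way to keep stray text (such as a brand logo on the table) out of the translation is to discard any recognised text whose bounding box falls outside the subtitle area. The sketch below assumes the recognition step returns each text fragment with a pixel bounding box. The `Detection` type, the `in_subtitle_zone` helper and the zone coordinates are hypothetical; this is not the actual code from this project, nor the response format of any particular API.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One recognised text fragment and its bounding box in pixels (hypothetical shape)."""
    text: str
    left: int
    top: int
    right: int
    bottom: int


def in_subtitle_zone(det, zone, min_overlap=0.8):
    """Return True if at least `min_overlap` of the box area lies inside `zone`.

    `zone` is (left, top, right, bottom) in pixels.
    """
    z_left, z_top, z_right, z_bottom = zone
    overlap_w = max(0, min(det.right, z_right) - max(det.left, z_left))
    overlap_h = max(0, min(det.bottom, z_bottom) - max(det.top, z_top))
    area = (det.right - det.left) * (det.bottom - det.top)
    return area > 0 and (overlap_w * overlap_h) / area >= min_overlap


def subtitle_text(detections, zone):
    """Join the fragments that fall inside the zone, ordered left to right."""
    kept = [d for d in detections if in_subtitle_zone(d, zone)]
    kept.sort(key=lambda d: d.left)
    return "".join(d.text for d in kept)


# Assumed 1280x720 frame with the subtitles in a strip near the bottom.
SUBTITLE_ZONE = (100, 600, 1180, 700)

frame = [
    Detection("拉球", 400, 620, 520, 680),       # subtitle fragment
    Detection("BrandName", 300, 350, 600, 420),  # logo printed on the table
    Detection("的时候", 520, 620, 700, 680),     # subtitle fragment
]
print(subtitle_text(frame, SUBTITLE_ZONE))  # only the two subtitle fragments survive
```

The zone would still have to be tuned per video, but it is a single rectangle rather than a full set of image filters.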


BTW, has your LAC bubbled? I have seen one post here where multiple people had this problem.
 
Member
Apr 2023
167
213
594
In one of my university courses, we were shown an example of the least-squares method. The professor showed us how to use it to identify handwritten digits from the matrix of pixels that makes up the image. I don't know if there are easier ways to do that nowadays, but maybe this can help somewhat. Unfortunately, this was over 3 years ago, so I remember nothing about the actual implementation.

Regarding the LAC, so far so good. Had 7 practices with it since getting it, still looks brand new, and plays GREAT. I am hoping this sheet survives for months and months, and that everyone else got fakes so mine is not just an exception. :ROFLMAO:
 
Well-Known Member
Sep 2013
7,557
6,741
16,389
This is going to be a hard one to make perfect.
There are quite a few mistakes, and I didn't even watch one minute.

In terms of the copyright claim, you should have a look at the rules surrounding that.
I.e., make it picture-in-picture or something like that.
Don't just use subtitles.
 
Reactions: blahness
Well-Known Member
Jul 2017
1,772
856
2,947
AI is overhyped, because what I consider true AI exists only in machines like IBM's Watson. True AI doesn't need to exist everywhere to be disruptive. For years, computers have been programmed to perform certain specific jobs better than humans, but I wouldn't call that AI. Chess is an example. However, some chess programs have been able to teach themselves how to play after being taught only the basic rules. Leela Chess Zero is an example.
BTW, if you want a good chess program, download Stockfish.
 
Member
Mar 2022
26
16
54
I understand what you are saying. But in this case I'm working with a true AI to recognise symbols such as Chinese characters. That AI was trained on millions of symbols, so the accuracy is good. For further info, I'm using a Google Cloud API for the text recognition.
 
Well-Known Member
Sep 2013
7,557
6,741
16,389
I'm not sure how much you understand of Chinese characters/words,
but with the language being thousands of years in the making, so many new words made just for table tennis, and even newer ones to come, the direct translation and the actual meaning are two different things.

I.e., PULL the ball = loop the ball,
and then you have hit arc ball = loop the ball too.

It should be loop a high arc,

but pull and high-profile are the correct direct translations.

Cricket is the correct direct translation.

What he is saying is 10 hits on the ball.
He can leave out "ball" and it is still fine, and then the direct translation will be 10 rackets.
In this context, it is 10 hits.
How do you train the AI to know when to use racket or hits? Or to override something like "cricket" (as an example) when, next time, they really are talking about cricket?
 
Reactions: isaacdl
Member
Apr 2023
167
213
594
I'm afraid OP himself is not the one actually training the AI, based on what he wrote to brokenball. What you usually do is get as many samples as you can, and train the AI (this is not really AI, it's a machine learning algorithm). If most results are irrelevant, then you'll be getting an irrelevant response, like talking about cricket or whatever. You can add your own bias (so make it TT related) by making TT-related results more rewarding for the algorithm, but this is diving into the mathematics of the algorithm.

By using an existing algorithm you have no control over, I am not sure there is a way to really make this work accurately.
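
Even without access to the model itself, the idea of making TT-related results "more rewarding" can be approximated from the outside, assuming the translation step can return several candidate translations per line (an assumption; not every service exposes alternatives). The sketch below re-ranks candidates by adding a bonus for each table tennis term they contain. The `TT_TERMS` list, the scores, the sample sentences and the `pick_translation` helper are all made up for illustration.

```python
import re

# Hypothetical domain vocabulary; a real list would be much longer.
TT_TERMS = {"loop", "topspin", "backspin", "serve", "racket", "forehand", "backhand"}


def pick_translation(candidates, bonus=0.2):
    """Pick the candidate with the best score after adding a domain bonus.

    `candidates` is a list of (text, model_score) pairs, where a higher
    model_score means the translator considered the text more likely.
    """
    def biased_score(item):
        text, score = item
        words = set(re.findall(r"[a-z]+", text.lower()))
        # Each distinct table tennis term found in the text earns `bonus`.
        return score + bonus * len(words & TT_TERMS)

    return max(candidates, key=biased_score)[0]


candidates = [
    ("pull the ball with the hair", 0.55),  # higher raw score, no TT terms
    ("loop the ball with topspin", 0.40),   # lower raw score, two TT terms
]
print(pick_translation(candidates))
```

With `bonus=0` the function falls back to the raw model ranking, which makes it easy to compare both behaviours on the same video.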
 
Well-Known Member
Sep 2013
7,557
6,741
16,389
Chinese is too tough to train I think.

Chinese is made up with so many words
Each one means something in TT, and when combined they can mean something else in TT.
Then you have the speaking style of the person, and the different jargon they may use.

It is easy to get Chinese subtitles from Chinese audio.
So if this works out, it will be great for the whole western TT community.
But knowing both languages, I can say it won't be an easy process.
 
Reactions: isaacdl
Well-Known Member
Nov 2022
1,102
1,462
4,047
Maybe there's a way to adjust the algorithm such that it recognizes certain characters and translates using manually provided translations.

So you can just make a list of say 100 or so of the most common technical TT terms (e.g. topspin, loop, back spin, drive, smash, recovery, salute, etc.) and force the algo to use those translations when the characters appear. The algo can handle the rest.
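
A minimal sketch of that forced-glossary idea, assuming the code has both the recognised Chinese line and the machine translation available. The `GLOSSARY` entries and the `apply_glossary` helper are hypothetical, and the two entries are only illustrative. A literal translation is replaced only when the source line actually contains the matching characters, so ordinary uses of the same English words are left alone.

```python
import re

# Hypothetical entries: (source characters, literal machine output, preferred term).
GLOSSARY = [
    ("拉球", "pull the ball", "loop the ball"),
    ("弧圈球", "arc ball", "loop"),
]


def apply_glossary(source_line, machine_translation, glossary=GLOSSARY):
    """Replace known literal translations with the preferred TT term,
    but only when the source line contains the matching Chinese term."""
    result = machine_translation
    for source_term, literal, preferred in glossary:
        if source_term in source_line:
            pattern = re.compile(re.escape(literal), flags=re.IGNORECASE)
            result = pattern.sub(preferred, result)
    return result


print(apply_glossary("正手拉球", "Forehand pull the ball"))
```

Checking the source characters first is one way to handle the "cricket" concern raised earlier in the thread: the override fires only for lines where the TT term was actually recognised.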
 
Reactions: isaacdl
Well-Known Member
Jul 2017
1,772
856
2,947
Chinese is too tough to train I think.

Chinese is made up with so many words
I think you mean symbols. All languages have lots of words. I have a document of the 3000 most commonly used Chinese symbols and their meanings. A computer database can store 3000 symbols and their meanings easily. The problem is that the symbols must be scanned and then scaled up or down to match the size of the symbols in the database. Actually, once this is done, the rest of the characters should use the same scale. A bigger problem is that some symbols can have 6 different meanings and even be pronounced differently depending on the context. Getting the computer to use the right context is difficult too. Then there are the idiomatic sayings. 对牛弹琴 is my favorite. A literal translation is playing the harp (qin) for cows. Harp is a loose translation; the qin is an old Chinese instrument. The literal meaning is much different from the intended meaning. Some things will not translate well. We say loop; in China they say pull ball, la qiu. So often there isn't a word-for-word translation. Matching symbols is the easy part of an extremely difficult project.
 
Reactions: isaacdl
Member
Mar 2022
26
16
54
Hi, thanks for your replies! Yes, the AI is already trained, but don't worry, in my code I can change the words automatically without any problem.
I already knew that pull = loop and that high-profile means a slow, high-arc topspin, but the code was at a preliminary stage, so I focused on other things. I can easily make those changes so that everything makes more sense. On the same topic, another word that keeps appearing in my code's output is hair. From context, I think it is meant to be something like spin or brush? Correct me if I'm wrong!

I wish I could learn Chinese so I could make these translations by hand.
Maybe there's a way to adjust the algorithm such that it recognizes certain characters and translates using manually provided translations.

So you can just make a list of say 100 or so of the most common technical TT terms (e.g. topspin, loop, back spin, drive, smash, recovery, salute, etc.) and force the algo to use those translations when the characters appear. The algo can handle the rest.
Exactly, this is the simple way to fix it. I already do this for "" and for other weird words. If you can suggest other translation fixes, that would be awesome, so maybe in the future we can all have perfect translations.
 
Member
Mar 2022
26
16
54
Btw, I didn't share the YouTube channels I take the videos from. I can only upload these videos to Dailymotion, as they have a watermark that makes them impossible to upload to YouTube:


If you know where these videos originally come from, a source on Chinese TV or Bilibili without the watermark would be awesome, to avoid copyright problems. I'm 100% sure those channels are not run by the coaches in the videos; the videos were taken from another site or platform.

It's a pity that those videos are not translated, because then everyone could learn about what is, in my opinion, the best technique in table tennis.
 
Well-Known Member
Sep 2013
7,557
6,741
16,389
Character, word, symbol: I'm not sure what you call it in English. I'm not a Chinese language expert.

Traditional Chinese writing has over 100,000 words.
Scholars have struggled to give a clear number.
Simplified Chinese was made to scale down the number of words.

Maybe this explains things better than I can: https://studycli.org/zh-CN/chinese-characters/number-of-characters-in-chinese

Then you have radicals, where one word is transformed by different radicals to give a different meaning.
1690939874550.png
Modern people still make mistakes when choosing the correct radical in writing.
I'm not sure if this is so much the case with simplified Chinese.

Something more common is he/she/it, where there is a he for males, a she for females, a he/she for animals and a he/she for gods (religious).

Chinese voice to Chinese subtitles by AI is one major challenge,
translating to English is another,
implementing TT jargon is another,
and you are right, I haven't even talked about idioms.

And then, as you said, there are words with multiple meanings.

There are just so many ways to say loop the ball in Chinese, and just trying to translate it manually into English is sometimes difficult for me.
 
Well-Known Member
Jul 2017
1,772
856
2,947
Tony, I think we have made the point that what isaacdl is trying to do is very difficult. I am a moderator on a Chinese servo control forum. Even Google Translate would screw up translations. In the past I had to write what I wanted to say in English, translate it to Chinese, and then translate that back into English to make sure nothing got screwed up. Sometimes I knew what was wrong and fixed it, but often I had to ask someone who was born in China to translate it better. Sometimes I would screw up and a forum member would help me out. However, I was told that I should include both my English text and the Chinese text in each post, since their ability to understand English was better than my or Google Translate's ability to understand Chinese. It is difficult, and even the Chinese don't get it right a lot of the time.
Above, Tony posted how there are many symbols for yang. Sometimes Chinese people will use any symbol they know for yang. The sound is right, but the written meaning is wrong.
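
The round-trip check described in this post (English to Chinese and back to English) can be automated as a rough sanity test. This is only a sketch: `round_trip_ok` and the toy dictionaries are hypothetical, and the two translator functions are passed in by the caller, so no real translation service or API is assumed. Similarity is measured with the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def round_trip_ok(english, to_chinese, to_english, threshold=0.9):
    """Translate English -> Chinese -> English and compare with the original.

    `to_chinese` and `to_english` are caller-supplied functions (for example,
    thin wrappers around a translation service).
    Returns (ok, back_translation).
    """
    back = to_english(to_chinese(english))
    similarity = SequenceMatcher(None, english.lower(), back.lower()).ratio()
    return similarity >= threshold, back


# Toy stand-ins for a real translator, so the example runs offline.
EN_TO_ZH = {"loop the ball": "拉球", "ten hits": "十板"}
ZH_TO_EN = {"拉球": "pull the ball", "十板": "ten hits"}

print(round_trip_ok("ten hits", EN_TO_ZH.get, ZH_TO_EN.get))
print(round_trip_ok("loop the ball", EN_TO_ZH.get, ZH_TO_EN.get))
```

A failed round trip does not say which direction went wrong, so a flagged sentence still needs a human who knows both languages, as the post points out.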
 
Member
Feb 2023
10
5
17
Hello isaacdl, thank you for your initiative, which will surely benefit enthusiasts around the world. I'm Chinese myself and familiar with TT jargon, and I understand English well. If you come across any language difficulty, do not hesitate to ask.
 
Member
Mar 2022
26
16
54
Hi silas6012, thanks! For example, the word I mentioned in this thread got translated as hair. From context, I think it is meant to be something like spin or brush. What do you think the real translation could be?
 