reply to rw re "A question and a note"
Reader comment on item: My Website Furthers Computer Science / Linguistics
Submitted by John in Michigan, USA (United States), Jul 24, 2017 at 15:23

I have an undergraduate degree in Cognitive Science. My take on this article is that the authors like the fact that there's a lot of material, and that it is cleanly and consistently formatted, with human translations of original articles into many languages, especially Middle-Eastern ones. They are probably using it to train or evaluate various machine learning programs. For example, they are using POS taggers, which Stanford defines as follows:

"A Part-Of-Speech Tagger (POS Tagger) is a piece of software that reads text in some language and assigns parts of speech to each word (and other token), such as noun, verb, adjective, etc., although generally computational applications use more fine-grained POS tags like 'noun-plural'." https://nlp.stanford.edu/software/tagger.shtml

"a possible connection between linguistic syntax and web mark-up" appears to be Dr. Pipes' phrase, but since he's asking the comment community for help, perhaps he is paraphrasing something the authors told him in describing their use of his site? If I had to guess, I'd say this phrase means that the authors are investigating whether the format of an article on a Web page (the mark-up) can provide clues about language syntax that ordinarily wouldn't be available in a non-Web article (for example, in a printed article that was merely scanned into a machine-readable format). Does this provide any clarity?

In my experience, we are still a long way from machine translation being able to replace a human translator! Perhaps the biggest recent advances in automated translation have come from Web indexing systems that avoid the classic, formal approach (i.e., translating something using rules of grammar, syntax, and semantics).
Rather, they use pattern matching to scour the Web for existing human translations (of phrases, sentences, paragraphs, or even whole articles) and match them up with the untranslated original. The software doesn't have to be smart enough to translate; it just has to be smart enough to recognize that someone has already published a translation on the Web. The software improves itself by collecting feedback from the user community about the quality of the translation, accepting suggestions, etc.
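
To make the pattern-matching idea concrete, here is a minimal sketch in Python. It assumes a small in-memory store of human translations rather than a Web-scale index, and it is not a description of any real system. The class `TranslationMemory`, its method names, and the `cutoff` value are hypothetical, chosen only for illustration; the fuzzy matching relies on the standard library's `difflib`.

```python
import difflib


def normalize(text):
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(text.lower().split())


class TranslationMemory:
    """Toy store of human translations, keyed by normalized source text."""

    def __init__(self):
        # normalized source -> {translation: community score}
        self.entries = {}

    def add(self, source, translation):
        """Record a human translation found for a source segment."""
        self.entries.setdefault(normalize(source), {}).setdefault(translation, 0)

    def feedback(self, source, translation, delta):
        """Apply a user vote (+1 / -1) to a known translation."""
        candidates = self.entries.get(normalize(source), {})
        if translation in candidates:
            candidates[translation] += delta

    def lookup(self, source, cutoff=0.8):
        """Return the best-rated stored translation, or None if nothing matches."""
        key = normalize(source)
        if key not in self.entries:
            # No exact match: fall back to the closest stored segment, if any.
            close = difflib.get_close_matches(key, list(self.entries), n=1, cutoff=cutoff)
            if not close:
                return None
            key = close[0]
        candidates = self.entries[key]
        return max(candidates, key=candidates.get)


tm = TranslationMemory()
tm.add("Good morning", "Bonjour")
tm.add("Good morning", "Bon matin")
tm.feedback("Good morning", "Bonjour", +1)  # a user prefers this rendering
print(tm.lookup("good  morning"))           # exact match after normalization
print(tm.lookup("Good mornin"))             # near match found by difflib
print(tm.lookup("Where is the station?"))   # nothing similar is stored
```

Note that nothing here translates anything: the lookup only recognizes text that a person has already translated, and the vote counts stand in for the community feedback described above.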
All materials by Daniel Pipes on this site: © 1968-2024 Daniel Pipes. daniel.pipes@gmail.com and @DanielPipes. Support Daniel Pipes' work with a tax-deductible donation to the Middle East Forum. Daniel J. Pipes (The MEF is a publicly supported, nonprofit organization under section 501(c)3 of the Internal Revenue Code. Contributions are tax deductible to the full extent allowed by law. Tax-ID 23-774-9796, approved Apr. 27, 1998. For more information, view our IRS letter of determination.)