IJFCC 2016 Vol.5(1): 18-22 ISSN: 2010-3751
doi: 10.18178/ijfcc.2016.5.1.436

Character Mapping for Cross-Language

Mazin Al-Shuaili and Marco Carvalho
Abstract—Out-of-vocabulary words are a significant challenge for cross-language information retrieval. Names of people constitute a large portion of out-of-vocabulary words, and several methodologies exist for matching names written in different languages. Some methods convert names to phonetic codes, such as Soundex, while others transliterate names from one language to another. We propose a technique that maps characters from different languages into English automatically, without human intervention and without prior knowledge of the language. The technique yields a statistical or phonetic model that can later be used for name comparison or for name transliteration across languages. The method also generates Soundex codes for the source language based on English Soundex codes. We implement this technique for five languages: Arabic, Russian, Urdu, Hindi, and Persian, and provide five Soundex tables as the result.
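
To illustrate the idea of deriving source-language Soundex codes from English ones, the following is a minimal sketch and not the authors' implementation. It assumes a character mapping is already available; the abstract says the paper obtains this mapping automatically, whereas the small DEMO_MAP here is hand-written and purely hypothetical. The function names derive_soundex_table and source_soundex are likewise illustrative. The soundex function follows the standard American Soundex rules.

```python
# English Soundex digit groups (standard American Soundex).
SOUNDEX_GROUPS = {
    "BFPV": "1",
    "CGJKQSXZ": "2",
    "DT": "3",
    "L": "4",
    "MN": "5",
    "R": "6",
}
ENGLISH_CODES = {ch: d for letters, d in SOUNDEX_GROUPS.items() for ch in letters}


def soundex(name: str) -> str:
    """Return the 4-character American Soundex code of an English name."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = ENGLISH_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = ENGLISH_CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        # H and W do not separate letters that share a code; vowels do.
        if ch not in "HW":
            prev = digit
    return (code + "000")[:4]


def derive_soundex_table(char_map: dict) -> dict:
    """Give each source character the Soundex digit of its English letter.

    Characters mapped to uncoded letters (vowels, H, W, Y) get "".
    """
    return {src: ENGLISH_CODES.get(eng[:1].upper(), "") for src, eng in char_map.items()}


def source_soundex(name: str, char_map: dict) -> str:
    """Map a source-language name to English letters, then apply Soundex.

    Characters missing from char_map are dropped.
    """
    english = "".join(char_map.get(ch, "") for ch in name.lower())
    return soundex(english)


# Hypothetical, hand-written mapping for a few Russian letters (assumption:
# not taken from the paper's tables, which are learned automatically).
DEMO_MAP = {"р": "r", "о": "o", "б": "b", "е": "e", "т": "t"}

if __name__ == "__main__":
    print(soundex("Robert"))
    print(derive_soundex_table(DEMO_MAP))
    print(source_soundex("Роберт", DEMO_MAP))
```

With such a table, a name written in the source script and its English spelling can be compared by code rather than by exact characters; under the assumed mapping, "Роберт" and "Robert" receive the same code.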

Index Terms—CLIR, data linkage, IR, name matching.

The authors are with the Florida Institute of Technology, Melbourne, FL 32901 USA (e-mail: malshuaili1994@my.fit.edu, mcarvalho@cs.fit.edu).


Cite: Mazin Al-Shuaili and Marco Carvalho, "Character Mapping for Cross-Language," International Journal of Future Computer and Communication, vol. 5, no. 1, pp. 18-22, 2016.
