IJFCC 2016 Vol.5(1): 18-22 ISSN: 2010-3751
doi: 10.18178/ijfcc.2016.5.1.436

Character Mapping for Cross-Language

Mazin Al-Shuaili and Marco Carvalho
Abstract—Out-of-vocabulary words are a significant challenge for cross-language information retrieval. Names of people constitute a large portion of out-of-vocabulary words, and several methodologies exist for matching names written in different languages. Some methods convert names to phonetic codes, such as Soundex, while others transliterate names from one language to another. We propose a technique that maps characters from different languages into English automatically, without human intervention and without prior knowledge of the language. This technique can provide a statistical or phonetic model that can be used later for cross-language name comparison or name transliteration. The method also generates Soundex codes for the source language based on English Soundex codes. We implement this technique for five languages: Arabic, Russian, Urdu, Hindi, and Persian, and provide the five resulting Soundex tables.
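
To make the Soundex idea in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It encodes English names with standard American Soundex and then derives a Soundex table for a source language from a character-to-English mapping. The names SAMPLE_MAP, derive_soundex_table, and transliterate are hypothetical, and the small Russian mapping is hand-written for illustration only; it is neither learned by the proposed technique nor taken from the paper's tables.

```python
# Standard American Soundex digit groups for English consonants.
SOUNDEX_GROUPS = {
    "1": "BFPV",
    "2": "CGJKQSXZ",
    "3": "DT",
    "4": "L",
    "5": "MN",
    "6": "R",
}
ENGLISH_CODE = {ch: d for d, letters in SOUNDEX_GROUPS.items() for ch in letters}


def soundex(name: str) -> str:
    """Return the four-character American Soundex code of an English name."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = ENGLISH_CODE.get(letters[0], "")
    for ch in letters[1:]:
        digit = ENGLISH_CODE.get(ch, "")
        if digit and digit != prev:
            code += digit
        # H and W do not separate same-coded consonants; vowels (and Y) do.
        if ch not in "HW":
            prev = digit
    return (code + "000")[:4]


# Hypothetical, hand-written sample mapping (assumption, for illustration only).
SAMPLE_MAP = {
    "р": "r", "о": "o", "б": "b", "е": "e", "т": "t",
    "п": "p", "д": "d", "л": "l", "м": "m", "н": "n",
}


def derive_soundex_table(char_map: dict) -> dict:
    """Give each source character the Soundex digit of its English counterpart.

    Characters whose counterpart has no digit (vowels, H, W, Y) get "0",
    a convention chosen here to mean "ignored".
    """
    return {
        src: ENGLISH_CODE.get(eng[:1].upper(), "") or "0"
        for src, eng in char_map.items()
    }


def transliterate(name: str, char_map: dict) -> str:
    """Map a source-language name to English letters, dropping unknown characters."""
    return "".join(char_map.get(c, "") for c in name.lower())


if __name__ == "__main__":
    print(soundex("Robert"))
    print(derive_soundex_table(SAMPLE_MAP))
    print(soundex(transliterate("Роберт", SAMPLE_MAP)))
```

Under these assumptions, a name written in the source script and its English spelling receive the same code whenever the mapped characters fall into the same Soundex groups, which is what allows the codes to be used for cross-language name comparison.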

Index Terms—CLIR, data linkage, IR, name matching.

The authors are with the Florida Institute of Technology, Melbourne, FL 32901 USA (e-mail: malshuaili1994@my.fit.edu, mcarvalho@cs.fit.edu).


Cite: Mazin Al-Shuaili and Marco Carvalho, "Character Mapping for Cross-Language," International Journal of Future Computer and Communication, vol. 5, no. 1, pp. 18-22, 2016.
