Text Processing

Editor
  • Vim
  • Geany
  • gedit
  • Limetext
  • Sublime Text
  • Code Browser
  • Notepad++
Gettext
  • Gettext
  • GettextLog
  • Cstrings
PO files translation
  • i18nspector
  • intltool
Text Processing
  • Text Processing in Python
Syntax Trees
  • Linguistic Tree Constructor
  • libstree
Regular Expression
  • Regular-Expressions
  • PCRE
  • TRE
  • Libbitap
  • shwild (C)
Regex Tools
  • re2c
  • Regulex
  • Redet
  • RegexPolicyDaemon
  • Kodos (for testing)
  • Regex Memo (jp)
PEG
  • pyPEG
Finite Automata
  • Libfa
Similarity
  • Similarity-utils
Text Summarizer
  • Open Text Summarizer
Text Classifier
  • Hiercat
  • Libtextcat
  • Wordtip (D-bus)
Reporting Engine
  • Papyrus
  • ReportLab (Python)
  • Rlib
  • RepTool
  • JasperReports (Java)
File Type
  • File Extensions Registry
  • Wotsits Format
Filetype Detection
  • file
  • FileType
Text Extractor
  • Universal text extractor
  • Libextractor
  • Flat file extractor
  • TXR (Pattern Matching)
Data Extractor
  • Talend
Binary file decoding
  • Hachoir (Python)
  • bdec (Python)
Full-Text Search
  • Groonga (C)
  • Elasticsearch (Java)
    • Kibana
  • Sphinx
  • Xapian
    • Recoll
  • Hyper Estraier
  • Swish-e
  • Lucy (C)
  • Apache Lucene (Java)
  • PhiloLogic
  • Senna
  • OpenFTS
  • DDC-concordance
  • Multivalent (Java)
Full-Text Search in Python
  • PyLucene
  • Pyndexter
  • Wilma
Desktop Search
  • Beagle
  • Pinot (D-Bus)
  • DocFetcher (Java)
Search
  • searchMonkey
Natural Language Processing
  • Natural (Node.js)
  • NLTK (Python)
  • OpenNLP (Java)
Natural Language Classifier
  • natural-synaptic (JavaScript)
  • natural-brain (JavaScript)
Text Analysis
  • TAMS
Morphological Analyzer (JP)
  • MeCab
  • Juman
  • ChaSen
Syntactic Analysis (JP)
  • CaboCha
  • KNP
Linguistic Tree Constructor
  • LTC
Predictive Text
  • Presage
Document Management
  • Epiware
Text Database Engine
  • Emdros
Automatic Summarization
  • Shuca (Python)
Sentiment Analysis
  • sentiment
  • AFINN
Translation
  • Liblouis
  • Sanzang
  • Translate Toolkit
  • OmegaT+ (Java)
LanguageTool
  • LanguageTool
CSV
  • csvkit
  • json2csv

Unicode, Font

Unicode
  • Unicode
  • Libucd (Database)
  • ICU
  • Unicode Utilities
  • Liblinebreak
  • uni2ascii
  • Libuninum (to number)
  • Utf8proc
  • Libutf8
  • Utfout
  • Luit (Terminal)
  • FriBidi (Arabic, Hebrew)
Unicode Python
  • PyICU (ICU)
  • CJK Python
Multilingualization
  • m17n
  • openi18n
Remove BOM
  • bomstrip
Character Set
  • IANA character sets
  • CSets (mapping tables)
  • Java Encodings
Detect CharSet
  • Charset Detection
  • Universal Encoding Detector (Python)
Language Guessing
  • Libtextcat
Encoding Converter
  • Libiconv
  • FSF Localization
  • nkf (JP)
Japanese Char Code
  • x0213
  • Shift_JIS vs UNICODE
  • Japanese Character Codes
  • JIS Extended Kanji (Unicode)
Font Library
  • FreeType2
  • Pango
  • ATSUI (Mac OS)
  • stsf
  • Xft (X11)
  • Xfstt (X11)
Font config
  • Libeasyfc
  • fonts-tweak-tool
  • Fontconfig
  • Fonty Python
  • Choosefont
Font Viewer
  • Gucharmap
  • fntsample
Font Converter
  • TTX/FontTools
  • otf2bdf (OpenType to BDF)
  • ttf2pt1
Font Editor
  • Birdfont
  • DoubleType (Java)
  • FontForge
  • AFE (Python)
  • gbdfed (Bitmap)
Fonts
  • Source Han Sans
  • Adobe Blank OpenType Font
  • Source Code Pro
  • Source Sans Pro
  • Tattoo script fonts
  • CJKUnifonts
  • DejaVu Fonts
  • PSF (Screen Font)
Japanese Fonts
  • IPAexfont
  • Tフォント
  • はんなり明朝
  • Takaoフォント
  • VLゴシックフォントファミリ
  • 春夏秋冬
  • 瀬戸フォント
  • 衡山毛筆フォント
  • 花園明朝
  • Corefonts (MS Fonts)
  • O'Reilly CJKV


Last Modified: 2016-01-19