Web services for predictive text analytics

Language

Language identification detects the language a text is written in. Different languages use different characters. For example, Russian (Кириллица), Chinese (汉字) and Arabic (العربية) are easy to distinguish. Languages that use the same characters (e.g., Latin alphabet, abc) often have cues that set them apart (e.g., é ↔ ë).

language code
Albanian sq
Arabic ar
Belarusian be
Bulgarian bg
Chinese zh
Croatian hr
Czech cs
Danish da
Dutch nl
English en
Estonian et
Finnish fi
French fr
German de
Greek el
Hebrew he
Hindi hi
Hungarian hu
Icelandic is
Indonesian id
Italian it
Japanese ja
Korean ko
Latvian lv
Lithuanian lt
Macedonian mk
Malay ms
Norwegian no
Persian fa
Polish pl
Portuguese pt
Romanian ro
Russian ru
Serbian sr
Slovak sk
Slovenian sl
Spanish es
Swahili sw
Swedish sv
Thai th
Turkish tr
Ukrainian uk
Vietnamese vi
Yiddish yi

Request

URL https://api.textgain.com/1/language
Parameter Value
q your text (max. 3,000 characters)
key your personal key

Response

The server returns the predicted language code and confidence, as a JSON string.

Q https://api.textgain.com/1/language?q=Loved+this+book!&key=***
A {"language": "en", "confidence": 0.95}
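The request above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper names `build_url` and `fetch` are hypothetical, and the base URL and the 3,000-character limit are taken from the tables above. It assumes the service is called with a plain GET request, as in the example.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.textgain.com/1/"  # base URL as used in the examples above
MAX_CHARS = 3000                           # documented limit for the q parameter


def build_url(endpoint, text, key, **params):
    """Return the GET URL for an endpoint such as "language" or "genre"."""
    if len(text) > MAX_CHARS:
        raise ValueError("q is limited to %d characters" % MAX_CHARS)
    query = {"q": text, "key": key}
    # Optional parameters (e.g. lang) are sent only when a value is given.
    query.update({k: v for k, v in params.items() if v is not None})
    # urlencode percent-encodes reserved characters and turns spaces into "+".
    return API_ROOT + endpoint + "?" + urllib.parse.urlencode(query)


def fetch(endpoint, text, key, **params):
    """Send the request and decode the JSON reply (needs network access and a valid key)."""
    with urllib.request.urlopen(build_url(endpoint, text, key, **params)) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

With a valid key, `fetch("language", "Loved this book!", key)` should return a dictionary shaped like the reply above. The same helper can be reused for the other endpoints by changing the endpoint name and passing `lang="en"` where a language is required.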

Genre

Genre classification predicts the type of text, based on its length, tone of voice and content.

genre    length  tone    content
article  +       formal  names & dates
blog     +       casual  pronouns (me, my)
mail             formal  interjections (hi, thanks)
news             formal  names & dates
review           casual  adjectives (good, bad)
status           casual  smileys :-)

Request

URL https://api.textgain.com/1/genre
Parameter Value
q your text (max. 3,000 characters)
key your personal key

Response

The server returns the predicted genre and confidence, as a JSON string.

Q https://api.textgain.com/1/genre?q=Loved+this+book!&key=***
A {"genre": "review", "confidence": 0.95}
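Every reply carries a confidence value, which a client can use to discard uncertain predictions. The sketch below shows one way to do that; the function name `confident_value` and the 0.8 threshold are assumptions for illustration, not part of the service.

```python
import json


def confident_value(reply, field, threshold=0.8):
    """Return reply[field] if the reported confidence reaches threshold, else None."""
    # threshold=0.8 is an arbitrary illustrative default, not a documented value.
    if reply.get("confidence", 0.0) >= threshold:
        return reply.get(field)
    return None


reply = json.loads('{"genre": "review", "confidence": 0.95}')  # sample reply from above
print(confident_value(reply, "genre"))        # review
print(confident_value(reply, "genre", 0.99))  # None
```

A suitable threshold depends on the application and is best chosen by checking predictions against texts whose genre is already known.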

Part-of-speech tags

Part-of-speech tagging identifies sentence breaks and word types. Words have different roles depending on how they are used. For example, the word shop can be a noun (a shop, object) or a verb (to shop, action).

part-of-speech  tag   %   example
noun            NOUN  30  car, Google
verb            VERB  14  be, have, do
punctuation     PUNC  11  . ! ? : ; ,
preposition     PREP  10  of, in, to, with
determiner      DET   9   a, an, the
adjective       ADJ   7   great, new, big
adverb          ADV   4   very, most
number          NUM   4   forty-two
pronoun         PRON  3   I, you, we, her
conjunction     CONJ  2   and, or, but
foreign         X     2   adieu
particle        PRT   2   's, to + VERB
interjection    INTJ  1   yes, oh, wow
* Percentages are indicative for English

Request

URL https://api.textgain.com/1/tag
Parameter Value
q your text (max. 3,000 characters)
key your personal key
lang en, es, cs, da, de, fr, it, nl, pt, ru, sv

Response

The server returns a JSON string with a list of sentences. Each sentence is a list of phrases. Each phrase is a list of {word, tag} values:

Q https://api.textgain.com/1/tag?q=I+didn't+like+the+book.&lang=en&key=***
A
{"text": [
           [
             [ {"word": "I"   , "tag": "PRON"} ], 
             [ {"word": "did" , "tag": "VERB"}, 
               {"word": "n't" , "tag": "ADV" }, 
               {"word": "like", "tag": "VERB"} ], 
             [ {"word": "the" , "tag": "DET" }, 
               {"word": "book", "tag": "NOUN"} ], 
             [ {"word": "."   , "tag": "PUNC"} ]
           ]
         ], "confidence": 0.95}
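The nesting of sentences, phrases and words can be unpacked with three loops. The sketch below works on a reply shaped like the one above (sentences, then phrases, then {word, tag} values); the helper names `tagged_words` and `words_with_tag` are hypothetical.

```python
import json

# A reply with the structure shown above, for the query "I didn't like the book."
SAMPLE = """
{"text": [[[{"word": "I", "tag": "PRON"}],
           [{"word": "did", "tag": "VERB"},
            {"word": "n't", "tag": "ADV"},
            {"word": "like", "tag": "VERB"}],
           [{"word": "the", "tag": "DET"},
            {"word": "book", "tag": "NOUN"}],
           [{"word": ".", "tag": "PUNC"}]]],
 "confidence": 0.95}
"""


def tagged_words(reply):
    """Yield (word, tag) pairs in reading order from a tag reply."""
    for sentence in reply["text"]:      # list of sentences
        for phrase in sentence:         # each sentence is a list of phrases
            for token in phrase:        # each phrase is a list of {word, tag}
                yield token["word"], token["tag"]


def words_with_tag(reply, tag):
    """Return all words carrying the given tag, e.g. "NOUN"."""
    return [word for word, t in tagged_words(reply) if t == tag]


reply = json.loads(SAMPLE)
print(words_with_tag(reply, "NOUN"))  # ['book']
print(words_with_tag(reply, "VERB"))  # ['did', 'like']
```

Flattening discards the phrase boundaries; code that needs them can stop at the second loop and work with each phrase as a list.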

Concepts

Concept extraction identifies keywords, key phrases and ‘named entities’ – names of persons, products, organizations, locations, dates, and so on. Keywords are nouns that occur frequently in a text, often near the start. Named entities frequently start with a capital letter (e.g., Barack Obama). Concept extraction can be used to summarize a text or, for example, to compare whether two texts discuss similar topics.

concept example
named entity iPhone, Iraq, Obama
noun car, phone

Request

URL https://api.textgain.com/1/concepts
Parameter Value
q your text (max. 3,000 characters)
key your personal key
lang en, es, de, fr, it, nl
top 10 (optional)

Response

The server returns the top concepts, as a JSON string.

Q https://api.textgain.com/1/concepts?q=Loved+this+book!&lang=en&key=***
A {"concepts": ["book"]}
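To compare whether two texts discuss similar topics, the concept lists returned for each text can be compared as sets. The sketch below uses the Jaccard overlap, which is one simple choice among several and is not prescribed by the service; the function name `concept_overlap` and the second reply are hypothetical.

```python
def concept_overlap(concepts_a, concepts_b):
    """Jaccard overlap of two concept lists: |A & B| / |A | B|, from 0.0 to 1.0."""
    a = {c.lower() for c in concepts_a}  # compare case-insensitively
    b = {c.lower() for c in concepts_b}
    if not a and not b:
        return 0.0  # no concepts on either side: treat as no evidence of overlap
    return len(a & b) / len(a | b)


first = {"concepts": ["book"]}             # reply from the example above
second = {"concepts": ["book", "author"]}  # hypothetical reply for another text
print(concept_overlap(first["concepts"], second["concepts"]))  # 0.5
```

A score of 1.0 means both texts yielded the same concepts and 0.0 means they share none. Short texts yield few concepts, so their scores are coarse.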

Sentiment

Sentiment analysis predicts whether a text is objective (fact) or subjective (opinion). Subjective text contains adverbs and adjectives with a positive or negative ‘polarity’ that capture the author’s personal opinion (e.g., an excellent opportunity or a bad product).

rating polarity example
★★★★☆ +1.0 awesome, excellent
★★★☆☆ +0.0 neutral, scientific
★★☆☆☆ −1.0 bad, expensive

Request

URL https://api.textgain.com/1/sentiment
Parameter Value
q your text (max. 3,000 characters)
key your personal key
lang en, es, ar, da, de, fr, it, ja, nl, no, pl, ru, sv, zh

Response

The server returns the predicted polarity and confidence, as a JSON string.

Q https://api.textgain.com/1/sentiment?q=Loved+this+book!&lang=en&key=***
A {"polarity": 1.0, "confidence": 0.70}
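When several texts are analyzed, for example a set of reviews, their polarities can be combined into one score. The sketch below weights each polarity by its confidence and maps the result to a label by its sign; both the weighting scheme and the function names are assumptions for illustration, and the replies are hypothetical.

```python
def polarity_label(polarity):
    """Map a polarity to 'positive', 'negative' or 'neutral' by its sign."""
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


def weighted_polarity(replies):
    """Confidence-weighted mean polarity of several sentiment replies."""
    total = sum(r["confidence"] for r in replies)
    if total == 0:
        return 0.0  # nothing to average
    return sum(r["polarity"] * r["confidence"] for r in replies) / total


replies = [
    {"polarity": 1.0, "confidence": 0.75},   # hypothetical reply for a first text
    {"polarity": -1.0, "confidence": 0.25},  # hypothetical reply for a second text
]
score = weighted_polarity(replies)
print(score, polarity_label(score))  # 0.5 positive
```

Weighting by confidence lets uncertain predictions count for less. An unweighted mean is an equally valid choice if all replies should count the same.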

Age

Age prediction estimates whether a text is written by an adolescent or an adult. Online, adolescents use more informal language, including abbreviated utterances (omg, wow) and mood words (awesome, lame). Adolescents tend to talk about school, parents, and partying. Adults tend to talk about work, children, and health, and use more complex sentence structures.

age range
adolescent 25−
adult 25+

Request

URL https://api.textgain.com/1/age
Parameter Value
q your text (max. 3,000 characters)
key your personal key
lang en, es, de, fr, nl

Response

The server returns the predicted age range and confidence, as a JSON string.

Q https://api.textgain.com/1/age?q=OMG+coool&lang=en&key=***
A {"age": "25-", "confidence": 0.75}

Gender

Gender prediction estimates whether a text is written by a man or a woman. Statistically, women tend to talk more about people and relationships (family, friends), while men are more interested in objects and things (e.g., cars, games). As a result, women will use more personal pronouns (I, you, we) in a social context and men will use more determiners (a, an, the) and more quantifiers (one, many).

gender code
man m
woman f

Request

URL https://api.textgain.com/1/gender
Parameter Value
q your text (max. 3,000 characters)
key your personal key
lang en, es, da, de, fi, fr, it, nl, no, pl, pt, sv
by name of the author (optional)

Response

The server returns the predicted gender and confidence, as a JSON string.

Q https://api.textgain.com/1/gender?q=I+like+it&by=Amy&lang=en&key=***
A {"gender": "f", "confidence": 0.95}

Education

Education prediction estimates whether a text displays basic or advanced writing skills. Statistically, people with higher education will use more formal language, with more punctuation marks ( , ; : ), correct spelling and capitalization, longer words and fewer emoticons (cf. idk lol just talkin ☺☺☺).

education level code
high (MBA, PhD, ...) +
low -

Request

URL https://api.textgain.com/1/education
Parameter Value
q your text (max. 3,000 characters)
key your personal key

Response

The server returns the predicted education level and confidence, as a JSON string.

Q https://api.textgain.com/1/education?q=AWSOME+PARTY+!!!&key=***
A {"education": "-", "confidence": 0.80}

Personality

Personality prediction estimates whether a text is written by an extraverted or an introverted person. Extraverts tend to be more sociable, assertive and playful, while introverts are more solitary, reserved and shy. As a result, extraverts will use ‘we’ more often, along with more positive adjectives and less formal language. Introverts will use ‘I’ more often and employ a broader vocabulary.

trait code
extraversion E
introversion I

Request

URL https://api.textgain.com/1/personality
Parameter Value
q your text (max. 3,000 characters)
key your personal key
lang en, nl

Response

The server returns the predicted personality and confidence, as a JSON string.

Q https://api.textgain.com/1/personality?q=I+love+it!&lang=en&key=***
A {"personality": "E", "confidence": 0.60}
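All requests above limit q to 3,000 characters, so longer texts have to be split before they are sent. The sketch below shows one possible approach that breaks at spaces where it can; the function name `split_text` is hypothetical, and how to combine the per-piece predictions is left to the caller.

```python
def split_text(text, limit=3000):
    """Split text into pieces of at most `limit` characters, breaking at spaces where possible."""
    chunks = []
    text = text.strip()
    while len(text) > limit:
        # Prefer the last space inside the window; fall back to a hard cut
        # when a single run of characters is longer than the limit.
        cut = text.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit
        chunks.append(text[:cut].rstrip())
        text = text[cut:].lstrip()
    if text:
        chunks.append(text)
    return chunks


print(split_text("aaa bbb ccc", limit=7))  # ['aaa bbb', 'ccc']
```

Splitting changes what each request sees. Predictions that depend on the length of a text, such as genre, may differ for pieces of a text compared with the text as a whole.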