Codeq NLP API Documentation

API Authentication

The first thing you need to do before you start using Codeq's NLP API is to sign up to generate a User ID and User Key. These two pieces of information are required to make requests. Go ahead and sign up if you have not done so yet.

Get started »

API Calls

Once you have obtained a User ID and User Key, you can make requests to our API in either of the following two ways:

1. By sending a POST request to our endpoint: https://api.codeq.com/v1

2. By using our Python SDK.

Rate Limits

There is a rate limit of 1,000 requests per day per user. If you are interested in increasing this number, please contact us.

HTTP Request

A POST request must be composed with the following parameters:

user_id: the ID provided during the registration process.

user_key: the key also generated during the registration process.

text: the document to be analyzed.

pipeline (optional): a string indicating the specific NLP annotators to apply.


curl -X POST https://api.codeq.com/v1 \
    -d '{
        "user_id": "YOUR_USER_ID",
        "user_key": "YOUR_USER_KEY",
        "text": "This model is an expensive alternative with useless battery life."
    }'
            

The request above will produce the following JSON output:


{
  "sentences": [
    {
      "position": 0,
      "raw_sentence": "This model is an expensive alternative with useless battery life.",
      "tokens": ["This", "model", "is", "an", "expensive", "alternative", "with", "useless", "battery", "life", "."],
      "pos_tags": ["DT", "NN", "VBZ", "DT", "JJ", "NN", "IN", "JJ", "NN", "NN", "."],
      "speech_acts": ["Statement"],
      "sentiments": ["Negative"],
      ...
    }
  ]
}
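The same request can be composed from Python without the SDK. Here is a minimal sketch using only the standard library's urllib, with the endpoint and field names taken from the parameters above (the Content-Type header is our assumption for a JSON body, not something the docs mandate):

```python
import json
import urllib.request

# Build the POST body with the parameters described above.
payload = {
    "user_id": "YOUR_USER_ID",
    "user_key": "YOUR_USER_KEY",
    "text": "This model is an expensive alternative with useless battery life.",
}

# Compose the request; urllib.request.urlopen(req) would send it
# and the response body would be the JSON shown above.
req = urllib.request.Request(
    "https://api.codeq.com/v1",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
```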
            

Python SDK

It is also possible to make requests using our Python SDK (requires Python 3). To install it:


pip install codeq-nlp-api
            

Once installed, you can import the SDK and use it to initialize a client object. This client can be used to send requests to the API and retrieve a Document object, which encapsulates a list of Sentence objects with the analyzed information of the text:


from codeq_nlp_api import CodeqClient

client = CodeqClient(user_id="YOUR_USER_ID", user_key="YOUR_USER_KEY")

text = "This model is an expensive alternative with useless battery life."
document = client.analyze(text)

for sentence in document.sentences:
    print(sentence.sentiments)

"""
Output:
>> ['Negative']
"""
            

HTTP Status

Regardless of the method you use to call our API, we return a status code that is helpful for debugging any error you may encounter. The following table summarizes the list of status responses:


CODE TEXT DESCRIPTION
200 Ok The request was successfully processed.
400 Bad request We are not able to process your request, usually because of a malformed JSON payload.
401 Unauthorized The user key or user ID you submitted is unknown.
404 Not found The usual "No idea what you are looking for".
429 Too many requests You have reached your quota limit. Wait or talk to us to increase your quota.
500 Internal server error There is something wrong in our spaghetti code that we will fix soon.
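When wiring the API into your own code, these codes can drive a simple retry decision. The sketch below is our own reading of the table above, not an official retry policy: quota exhaustion and server errors may clear on their own, while client errors will fail identically until the request is fixed.

```python
# Status codes from the table above, grouped by how a client should react.
RETRYABLE = {429, 500}           # quota exhausted or transient server error
CLIENT_ERRORS = {400, 401, 404}  # fix the payload or credentials first

def should_retry(status: int) -> bool:
    """Return True if the same request may succeed when retried later."""
    return status in RETRYABLE

print(should_retry(429))  # True: quota errors clear when the daily window resets
print(should_retry(400))  # False: a malformed JSON payload fails every time
```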

Pipeline

By default, the API analyzes the text with our complete set of NLP Annotators. Alternatively, you can specify a custom pipeline depending on your needs.

For example, if you are only interested in getting the speech acts and sentiment of a text, you can declare a "pipeline" key in the content of the POST request, or pass it as a parameter to the Python SDK client object, with a string of comma-separated values indicating the Annotators you need:


client = CodeqClient(user_id="YOUR_USER_ID", user_key="YOUR_USER_KEY")

text = "This is getting very interesting!"
pipeline = "speechact, sentiment"
document = client.analyze(text, pipeline)

for sentence in document.sentences:
    print(sentence.speech_acts)
    print(sentence.sentiments)

"""
Output:
>> ['Statement']
>> ['Positive']
"""
            

Annotators

The following table shows the complete list of Annotators of our NLP API, including the keyword you can use as the value of the pipeline parameter, as well as a description of each Annotator's output and its respective Python attribute:


KEY NAME DESCRIPTION
tokenize Tokenization Generates a list of words from raw text.
RETURNS: list
ATTR:document.tokens
ATTR:sentence.tokens
ssplit Sentence Segmentation Generates a list of sentences from a raw text.
RETURNS: list
ATTR:document.sentences
stopword Stopword Removal Produces a list of tokens after removing common stopwords from the text.
RETURNS: list
ATTR:sentence.tokens_filtered
stem Stemming Generates a list of the stems (root forms) of the tokens.
RETURNS: list
ATTR:sentence.stems
truecase True casing Produces a string with the true case of sentence tokens.
RETURNS: string
ATTR:sentence.truecase_sentence
detruecase Detrue casing Produces a string with the predicted original case of the tokens.
RETURNS: string
ATTR:sentence.detruecase_sentence
pos Part of Speech Tagging Generates a list containing the PoS-tag for each sentence token.
RETURNS: list
ATTR:sentence.pos_tags
lemma Lemmatization Generates a list containing the lemma for each sentence token.
RETURNS: list
ATTR:sentence.lemmas
parse Dependency parser Generates a list of dependencies in 3-tuples consisting of: head, dependent and relation. Head and dependent are in the format "token@@@position". Positions are 1-indexed, with 0 being the index for the root.
RETURNS: list
ATTR:sentence.dependencies
ner Named Entity Recognition Produces a list of named entities found in a sentence, containing the tokens of the entity, its type and its span positions.
RETURNS: list of tuples
ATTR:sentence.named_entities
speechact Speech Act Classifier Generates a list of tags indicating the predicted speech acts of a sentence.
RETURNS: list
ATTR:sentence.speech_acts
question Question Classifier Generates a list of tags indicating the predicted type of question, if a sentence is classified as such.
RETURNS: list
ATTR:sentence.question_types
sentiment Sentiment classifier Generates a list of values for each sentence indicating the predicted sentiment label.
RETURNS: list
ATTR:sentence.sentiments
emotion Emotion classifier Generates a list of values for each sentence indicating the predicted emotion label.
RETURNS: list
ATTR:sentence.emotions
sarcasm Sarcasm classifier Generates a label predicting if a sentence is sarcastic or not.
RETURNS: string
ATTR:sentence.sarcasm
coreference Coreference resolution Generates a list of resolved pronominal coreferences. Each coreference is a dictionary that includes: mention, referent, first_referent, where each of those elements is a tuple containing a coreference id, the tokens and the span of the item. Additionally, each coreference dict contains a coreference chain (all the ids of the linked mentions) and the first referent of a chain.
RETURNS: list of dicts
ATTR:sentence.coreferences
date Date resolution Generates a list of tuples for each sentence with all resolved date entities given a relative date (by default: today). The output includes the date entity, its tokens span and the resolved timestamp.
RETURNS: list
ATTR:sentence.dates
task Task Extraction Generates different values including whether a sentence is predicted to be a task, and if so, it returns a list of tags indicating its predicted task type and a list of tuples indicating suggested task actions.
RETURNS: int
ATTR:sentence.is_task
RETURNS: list
ATTR:sentence.task_subclassification
ATTR:sentence.task_actions
compress Sentence compression Provides, where applicable, a shortened version of a sentence that gives its main point without extraneous clauses. It uses the output of the dependency parser Annotator to determine parts of the sentence that serve to modify, explain, or embellish the main points and strips them off, leaving only the core information provided by the sentence.
RETURNS: string
ATTR:sentence.compressed_sentence
summarize Summarization Generates an extractive summary with the most relevant sentences of the input text.
RETURNS: string
ATTR:document.summary
ATTR:document.compressed_summary
summarize_compress Summarization with compression Generates an extractive summary with the most relevant sentences of the input text in its compressed form, regardless of whether the compress Annotator is specified in the pipeline.
RETURNS: string
ATTR:document.summary
ATTR:document.compressed_summary
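As a worked example of the parse output described above, each head and dependent arrives as a "token@@@position" string, which can be split back into its parts. A minimal sketch (the sample triple is illustrative, not actual API output):

```python
def split_item(item: str):
    """Split a "token@@@position" string into (token, position)."""
    token, position = item.rsplit("@@@", 1)
    return token, int(position)

# Illustrative dependency triple in the documented (head, dependent, relation)
# format; positions are 1-indexed, with 0 reserved for the root.
dep = ("is@@@3", "model@@@2", "nsubj")
head, dependent, relation = dep
print(split_item(head))       # ('is', 3)
print(split_item(dependent))  # ('model', 2)
```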

Output Classes

The following table lists the output classes of the Deep Learning classifiers, as well as references to the output of some text-processing tools.

ANNOTATOR OUTPUT CLASSES
Part of Speech Tagging Penn Treebank Reference
Dependency parser Stanford Dependencies version 3.5.2 Reference
Named Entity Recognition PER Person
LOC Location
ORG Organization
MISC Miscellaneous
DATE Date
MONEY Money
URL Url
PHONE Phone number
EMAIL Email address
TWITTERNAME Twitter name
TRACKINGNUMBER Tracking number
AIRLINECODE Airline code
AIRLINENAME Airline name
AIRPORTCODE Airport code
AIRPORTNAME Airport name
EMOJI Emoji
SMILIE Smilie
Speech Act Classifier Statement
Command/Request
Desire/Need
Commitment
Question
Other
Question Classifier Yes/No question (qy)
Wh- question (qw)
Open-ended question (qo)
Or question (qr)
Declarative question (d)
Tag question (g)
Rhetorical question (qh)
Sentiment Classifier Positive
Neutral
Negative
Emotion Classifier Anger
Disgust/Dislike
Fear
Joy/Like
Sadness
Surprise
Excitement
Angst
No emotion
Sarcasm Classifier Sarcastic
Non-sarcastic