Load data

You can access the Thinknum API through the thinknum Python package.

Installation

pip install thinknum

Query

Import the library.

from thinknum import Query

To authenticate, you must first obtain a client_id and client_secret from your assigned Thinknum account manager. Your client_secret must not be shared or exposed via publicly accessible resources (such as browser client-side scripting).

q = Query(
    client_id='Your client id',
    client_secret='Your client secret'
)
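
Hardcoding the secret in a script makes it easy to leak. One possible approach, sketched below, is to read both values from environment variables. The variable names THINKNUM_CLIENT_ID and THINKNUM_CLIENT_SECRET and the helper load_credentials are assumptions made for this example; they are not part of the thinknum package.

```python
import os


def load_credentials(environ=os.environ):
    """Return keyword arguments for Query() read from environment variables.

    The variable names are hypothetical; pick whatever suits your deployment.
    """
    try:
        return {
            "client_id": environ["THINKNUM_CLIENT_ID"],
            "client_secret": environ["THINKNUM_CLIENT_SECRET"],
        }
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None


# Usage (requires the thinknum package and valid credentials):
# from thinknum import Query
# q = Query(**load_credentials())
```

The returned dict uses the same argument names as the Query call above, so it can be unpacked directly into the constructor.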

The default timeout is 180 seconds. To change it, pass the timeout argument (in seconds).

q = Query(
    client_id='Your client id',
    client_secret='Your client secret',
    timeout=300
)

Setting timeout to None disables the timeout, so a timeout error is never raised.

q = Query(
    client_id='Your client id',
    client_secret='Your client secret',
    timeout=None
)

If you need to use a proxy, you can configure it with the proxies argument.

proxies = {
  "http": "http://10.10.1.10:3128",
  "https": "http://10.10.1.10:1080",
}

q = Query(
    client_id='Your client id',
    client_secret='Your client secret',
    proxies=proxies
)
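
If proxy settings already live in the environment, the proxies dict can be assembled from them instead of being written by hand. The sketch below assumes the conventional HTTP_PROXY and HTTPS_PROXY variable names; the helper proxies_from_env is hypothetical and not part of the thinknum package.

```python
import os


def proxies_from_env(environ=os.environ):
    """Build a proxies dict in the shape shown above from HTTP_PROXY / HTTPS_PROXY."""
    proxies = {}
    for scheme, variable in (("http", "HTTP_PROXY"), ("https", "HTTPS_PROXY")):
        # Accept both upper-case and lower-case spellings of the variable
        url = environ.get(variable) or environ.get(variable.lower())
        if url:
            proxies[scheme] = url
    return proxies


# Usage (requires the thinknum package and valid credentials):
# from thinknum import Query
# q = Query(client_id='Your client id', client_secret='Your client secret',
#           proxies=proxies_from_env())
```

When neither variable is set, the helper returns an empty dict; in that case it is probably simplest to omit the proxies argument.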

To skip SSL certificate verification, set verify to False. By default, verify is True.

q = Query(
    client_id='Your client id',
    client_secret='Your client secret',
    verify=False
)

To get a list of datasets, each with its dataset id and display_name:

q.get_dataset_list()

To get a dataset's metadata:

q.get_dataset_metadata(dataset_id='job_listings')

You can limit the dataset list to a specific ticker by passing it as the query argument. For example, to get all datasets available for Apple Inc.:

q.get_ticker_dataset_list(query='nasdaq:aapl')

You can search for tickers.

q.get_ticker_list(query="tesla")

You can also search for tickers within a particular dataset.

q.get_ticker_list(query="tesla", dataset_id='job_listings')

You can retrieve data for a specific dataset and tickers with various filters. For example, to retrieve lulu's job listings in 2020:

q.add_ticker('nasdaq:lulu') # Add ticker
q.add_filter(
    column='as_of_date',
    type='>=',
    value=["2020-01-01"]
)
q.add_filter(
    column='as_of_date',
    type='<=',
    value=["2020-12-31"]
)
q.add_sort(
    column='as_of_date',
    order='asc'
)   # Add Sort
q.get_data(dataset_id='job_listings')    # Retrieve data
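
The two date filters above always travel together, so they can be wrapped in a small helper. This is a minimal sketch: add_date_range is a hypothetical name, and it only uses the add_filter call exactly as it appears in the example above.

```python
def add_date_range(q, start, end, column="as_of_date"):
    """Add an inclusive [start, end] range as two filters on `column`.

    `q` is expected to be a thinknum Query; dates are ISO strings (YYYY-MM-DD).
    """
    if start > end:  # ISO date strings sort chronologically
        raise ValueError("start must not be after end")
    q.add_filter(column=column, type=">=", value=[start])
    q.add_filter(column=column, type="<=", value=[end])


# Usage (requires an authenticated Query instance `q`):
# q.add_ticker('nasdaq:lulu')
# add_date_range(q, "2020-01-01", "2020-12-31")
# q.get_data(dataset_id='job_listings')
```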

You can retrieve data with OR filters. For example, to retrieve lulu's 2020 job listings whose title or description contains "sales":

q.add_ticker('nasdaq:lulu') # Add ticker
q.add_filter(
    column='as_of_date',
    type='>=',
    value=["2020-01-01"]
)
q.add_filter(
    column='as_of_date',
    type='<=',
    value=["2020-12-31"]
)
root_condition = q.add_condition(
    match='any'
)
q.add_filter(
    column='title',
    type='...',
    value='sales',
    condition=root_condition
)
q.add_filter(
    column='description',
    type='...',
    value='sales',
    condition=root_condition
)
q.get_data(dataset_id='job_listings')    # Retrieve data
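
The OR pattern above generalizes to any number of columns: create one condition with match='any' and attach one filter per column to it. The helper below is a sketch under that reading; add_any_contains is a hypothetical name, and the '...' filter type is copied from the example above.

```python
def add_any_contains(q, columns, value):
    """Add one 'any' condition holding a '...' filter per column.

    Mirrors the example above: a row matches if at least one column matches.
    Returns the condition so further filters can be attached to it.
    """
    condition = q.add_condition(match="any")
    for column in columns:
        q.add_filter(column=column, type="...", value=value, condition=condition)
    return condition


# Usage (requires an authenticated Query instance `q`):
# add_any_contains(q, ["title", "description"], "sales")
```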

You can retrieve data with more complicated filters. To retrieve lulu's sales jobs in 2020 or marketing jobs in 2021:

q.add_ticker('nasdaq:lulu') # Add ticker
# No top-level date filter here: each condition below sets its own date range
root_condition = q.add_condition(
    match='any',
)
c1 = q.add_condition(
    match='all',
    condition=root_condition
)
q.add_filter(
    column='title',
    type='...',
    value='sales',
    condition=c1
)
q.add_filter(
    column='as_of_date',
    type='>=',
    value=["2020-01-01"],
    condition=c1
)
q.add_filter(
    column='as_of_date',
    type='<=',
    value=["2020-12-31"],
    condition=c1
)

c2 = q.add_condition(
    match='all',
    condition=root_condition
)
q.add_filter(
    column='title',
    type='...',
    value='marketing',
    condition=c2
)
q.add_filter(
    column='as_of_date',
    type='>=',
    value=["2021-01-01"],
    condition=c2
)
q.add_filter(
    column='as_of_date',
    type='<=',
    value=["2021-12-31"],
    condition=c2
)
q.get_data(dataset_id='job_listings')    # Retrieve data

Note that conditions can be nested at most two levels deep.
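
To make the nesting easier to read, the same logic can be written as a plain Python predicate: the root condition is an "any" over two "all" groups. This is only an illustration of the boolean structure, not how the API evaluates it. It assumes the '...' filter means substring containment, and the rows are invented for the example.

```python
def matches(row):
    """Plain-Python reading of the nested query: (c1) OR (c2)."""
    title = row["title"]
    date = row["as_of_date"]  # ISO date strings compare chronologically
    # c1: match='all' -> sales title AND date within 2020
    c1 = all(["sales" in title, "2020-01-01" <= date <= "2020-12-31"])
    # c2: match='all' -> marketing title AND date within 2021
    c2 = all(["marketing" in title, "2021-01-01" <= date <= "2021-12-31"])
    return any([c1, c2])  # root condition: match='any'


# Invented rows, for illustration only
rows = [
    {"title": "retail sales associate", "as_of_date": "2020-06-01"},
    {"title": "marketing manager", "as_of_date": "2020-06-01"},
    {"title": "marketing manager", "as_of_date": "2021-03-15"},
]
print([matches(row) for row in rows])  # [True, False, True]
```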

You can also specify start and limit. Their default values are 1 and 100000, respectively.

To raise an exception when the data is truncated, set except_on_truncate to True. It is False by default.

q.get_data(
    dataset_id='job_listings', 
    start=1,
    limit=1000,
    except_on_truncate=True
)
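
When a result is larger than one request should carry, start and limit can be stepped through in fixed-size windows. The sketch below only computes the (start, limit) pairs. It assumes start is a 1-based row offset, which the default of 1 suggests but this document does not state outright; page_windows is a hypothetical helper.

```python
def page_windows(total_rows, page_size):
    """Yield (start, limit) pairs covering rows 1..total_rows.

    Assumes `start` is a 1-based row offset, as the default of 1 suggests.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for start in range(1, total_rows + 1, page_size):
        # The last window may be shorter than page_size
        yield start, min(page_size, total_rows - start + 1)


print(list(page_windows(2500, 1000)))
# [(1, 1000), (1001, 1000), (2001, 500)]
```

Each pair could then be passed to q.get_data(dataset_id=..., start=start, limit=limit). Whether tickers and filters persist between get_data calls is not covered here, so check that before relying on a loop.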

Sometimes you only need aggregated results for a dataset. In such cases, you can retrieve them with the add_group and add_aggregation functions.

q.add_ticker('nasdaq:lulu') # Add ticker
q.add_group(column='as_of_date') # Add group
q.add_aggregation(
    column='dataset__entity__entity_ticker__ticker__ticker',
    type='count'
)   # Add aggregation
q.add_sort(
    column='as_of_date',
    order='asc'
)   # Add sort
q.get_data(dataset_id='job_listings')
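
As a rough mental model, the query above groups rows by as_of_date, counts the rows in each group, and sorts by date. The snippet below reproduces that idea locally with collections.Counter. The rows are invented for illustration, and the assumption that 'count' counts rows per group is ours rather than something this document states.

```python
from collections import Counter

# Invented rows, for illustration only
rows = [
    {"as_of_date": "2020-01-01", "ticker": "nasdaq:lulu"},
    {"as_of_date": "2020-01-01", "ticker": "nasdaq:lulu"},
    {"as_of_date": "2020-01-02", "ticker": "nasdaq:lulu"},
]

# Group by as_of_date, count rows per group, then sort ascending by date
counts = sorted(Counter(row["as_of_date"] for row in rows).items())
print(counts)  # [('2020-01-01', 2), ('2020-01-02', 1)]
```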

There are a few functions that you can apply to queries to gain more insight into the data. You can retrieve the list of functions available for a dataset with get_dataset_metadata. For example, the store dataset has a nearby function.

q.add_ticker('nasdaq:lulu')
q.add_function(
    function='nearby',
    parameters={
        "dataset_type": "dataset",
        "dataset": "store",
        "tickers":["nyse:ua"],
        "entities": [],
        "distance": 5,
        "is_include_closed": False
    }
)
q.get_data(dataset_id='store')

You can also apply the nearest function to the store dataset:

q.add_ticker('nasdaq:lulu')
q.add_function(
    function='nearest',
    parameters={
        "dataset_type": "dataset",
        "dataset": "store",
        "tickers":["nyse:ua"],
        "entities": [],
        "ranks": [1],
        "is_include_closed": False
    }
)
q.get_data(dataset_id='store')
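
The nearby and nearest examples share most of their parameters and differ only in distance versus ranks. A small builder can keep the shared keys in one place. This is a sketch: store_function_params is a hypothetical helper, and the keys are copied from the two examples above.

```python
def store_function_params(tickers, include_closed=False, **extra):
    """Return a parameters dict with the keys shared by 'nearby' and 'nearest'."""
    params = {
        "dataset_type": "dataset",
        "dataset": "store",
        "tickers": list(tickers),
        "entities": [],
        "is_include_closed": include_closed,
    }
    params.update(extra)  # e.g. distance=5 for nearby, ranks=[1] for nearest
    return params


nearby = store_function_params(["nyse:ua"], distance=5)
nearest = store_function_params(["nyse:ua"], ranks=[1])

# Usage (requires an authenticated Query instance `q`):
# q.add_function(function='nearby', parameters=nearby)
```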

Similarly, you can apply the sales function to the Car Inventory dataset:

q.add_ticker('nyse:kmx')
q.add_function(
    function='sales',
    parameters={
        "lookahead_day_count": 2,
        "start_date": "2020-01-01",
        "end_date": "2020-01-07"
    }
)
q.get_data(dataset_id='car_inventory')
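
If the dates come from datetime.date objects rather than strings, the parameters can be built and sanity-checked before the request is sent. The helper sales_params below is hypothetical; the keys and the YYYY-MM-DD format are taken from the example above.

```python
from datetime import date


def sales_params(start, end, lookahead_day_count):
    """Build the 'sales' parameters dict from date objects."""
    if end < start:
        raise ValueError("end must not be before start")
    return {
        "lookahead_day_count": lookahead_day_count,
        "start_date": start.isoformat(),  # YYYY-MM-DD
        "end_date": end.isoformat(),
    }


params = sales_params(date(2020, 1, 1), date(2020, 1, 7), 2)

# Usage (requires an authenticated Query instance `q`):
# q.add_function(function='sales', parameters=params)
```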

You can also reset the entire query.

q.reset_query()

Also, you can reset tickers.

q.reset_tickers()

Also, you can reset filters.

q.reset_filters()

Also, you can reset functions.

q.reset_functions()

Also, you can reset groups.

q.reset_groups()

Also, you can reset aggregations.

q.reset_aggregations()

Also, you can reset sorts.

q.reset_sorts()
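
Because tickers, filters, and other settings accumulate on the same Query object, it can be convenient to reset it automatically around each request. The context manager below is a sketch of that idea; clean_query is a hypothetical name, and it relies only on the reset_query call shown above.

```python
from contextlib import contextmanager


@contextmanager
def clean_query(q):
    """Reset `q` before and after the block so state never leaks between requests."""
    q.reset_query()
    try:
        yield q
    finally:
        q.reset_query()


# Usage (requires an authenticated Query instance `q`):
# with clean_query(q) as fresh:
#     fresh.add_ticker('nasdaq:lulu')
#     data = fresh.get_data(dataset_id='job_listings')
```

The reset in the finally clause runs even if the request raises, so a failed query does not leave stale filters behind.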

History

Import the library.

from thinknum import History

As with Query, you must authenticate to use History.

h = History(
    client_id='Your client id',
    client_secret='Your client secret'
)

If you need to use a proxy, you can configure it with the proxies argument.

proxies = {
  "http": "http://10.10.1.10:3128",
  "https": "http://10.10.1.10:1080",
}

h = History(
    client_id='Your client id',
    client_secret='Your client secret',
    proxies=proxies
)

To skip SSL certificate verification, set verify to False. By default, verify is True.

h = History(
    client_id='Your client id',
    client_secret='Your client secret',
    verify=False
)

To retrieve a list of available history for a dataset:

h.get_history_list(dataset_id='store')

You can view the metadata for the historical file:

h.get_history_metadata(
    dataset_id='store',
    history_date='2020-03-09'
)

To download a CSV of the historical data:

h.download(
    dataset_id='store',
    history_date='2020-03-09'
)

You can specify a download path:

h.download(
    dataset_id='store',
    history_date='2020-03-09', 
    download_path='/Users/sangwonseo/Downloads'
)
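
To fetch several historical files in one go, the download call can be looped over a list of dates. The sketch below assumes download_path names a directory, as the example path above suggests; download_dates is a hypothetical helper, and the dates must be supplied by the caller.

```python
from pathlib import Path


def download_dates(h, dataset_id, history_dates, download_dir):
    """Call h.download once per date, saving into download_dir.

    Assumes download_path names a directory, as the example path above suggests.
    """
    directory = Path(download_dir)
    directory.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    for history_date in history_dates:
        h.download(
            dataset_id=dataset_id,
            history_date=history_date,
            download_path=str(directory),
        )


# Usage (requires an authenticated History instance `h`):
# download_dates(h, 'store', ['2020-03-09'], 'downloads/store')
```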

To get the download URL of the historical data:

h.get_url(
    dataset_id='store',
    history_date='2020-03-09'
)

License

MIT

For more details

https://pypi.org/project/thinknum/