Load data
You can access the Thinknum API using the thinknum Python package.
Installation
pip install thinknum
Query
Import the library:
from thinknum import Query
To authenticate, you must first obtain a client_id and client_secret from your assigned Thinknum account manager. Your client_secret must not be shared or exposed via publicly accessible resources (such as browser client-side scripting).
q = Query(
client_id='Your client id',
client_secret='Your client secret'
)
The default timeout is 180 seconds. If you need a different timeout, configure it with the timeout argument (in seconds).
q = Query(
client_id='Your client id',
client_secret='Your client secret',
timeout=300
)
No timeout error will be raised if you set timeout to None.
q = Query(
client_id='Your client id',
client_secret='Your client secret',
timeout=None
)
If you need to use a proxy, you can configure it with the proxies argument.
proxies = {
"http": "http://10.10.1.10:3128",
"https": "http://10.10.1.10:1080",
}
q = Query(
client_id='Your client id',
client_secret='Your client secret',
proxies=proxies
)
Requests will skip SSL certificate verification if you set verify to False. By default, verify is set to True.
q = Query(
client_id='Your client id',
client_secret='Your client secret',
verify=False
)
You can retrieve a list of available datasets; each entry includes the dataset id and its display_name.
q.get_dataset_list()
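As a sketch only (the exact response shape may differ in your environment), you could look up a dataset's id by its display_name in the returned list, assuming each entry is a mapping with 'id' and 'display_name' keys as described above:

```python
def find_dataset_id(datasets, display_name):
    """Return the id of the first dataset whose display_name matches.

    Assumes each entry in `datasets` is a mapping with 'id' and
    'display_name' keys; returns None if no dataset matches.
    """
    for dataset in datasets:
        if dataset.get('display_name') == display_name:
            return dataset.get('id')
    return None
```

With a live client this could be called as find_dataset_id(q.get_dataset_list(), 'Job Listings'); verify the actual field names against your own get_dataset_list output.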
You can retrieve a dataset's metadata:
q.get_dataset_metadata(dataset_id='job_listings')
It's possible to limit the dataset list to a specific ticker by specifying a "ticker" query parameter. For example, to get all datasets available for Apple Inc:
q.get_ticker_dataset_list(query='nasdaq:aapl')
You can search for tickers.
q.get_ticker_list(query="tesla")
You can also search for tickers within a particular dataset.
q.get_ticker_list(query="tesla", dataset_id='job_listings')
You can retrieve data for a specific dataset and tickers with various filters. For example, to retrieve lulu's job listings from 2020:
q.add_ticker('nasdaq:lulu') # Add ticker
q.add_filter(
column='as_of_date',
type='>=',
value=["2020-01-01"]
)
q.add_filter(
column='as_of_date',
type='<=',
value=["2020-12-31"]
)
q.add_sort(
column='as_of_date',
order='asc'
) # Add Sort
q.get_data(dataset_id='job_listings') # Retrieve data
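Date ranges like the one above always come as a >=/<= pair of add_filter calls, so a small helper can apply both at once. This is a convenience sketch built only on the add_filter signature shown above; it is not part of the thinknum package:

```python
def add_date_range(query, start_date, end_date, column='as_of_date'):
    """Apply a closed [start_date, end_date] range to `query` as the
    same pair of >= / <= filters used in the example above."""
    query.add_filter(column=column, type='>=', value=[start_date])
    query.add_filter(column=column, type='<=', value=[end_date])
    return query
```

With a live client, add_date_range(q, '2020-01-01', '2020-12-31') followed by q.get_data(...) reproduces the example above.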
You can retrieve data with OR filters. For example, to retrieve lulu's 2020 job listings whose title or description contains "sales":
q.add_ticker('nasdaq:lulu') # Add ticker
q.add_filter(
column='as_of_date',
type='>=',
value=["2020-01-01"]
)
q.add_filter(
column='as_of_date',
type='<=',
value=["2020-12-31"]
)
root_condition = q.add_condition(
match='any'
)
q.add_filter(
column='title',
type='...',
value='sales',
condition=root_condition
)
q.add_filter(
column='description',
type='...',
value='sales',
condition=root_condition
)
q.get_data(dataset_id='job_listings') # Retrieve data
You can retrieve data with more complicated, nested filters. For example, to retrieve lulu's sales jobs from 2020 or marketing jobs from 2021 (each branch below carries its own date range):
q.add_ticker('nasdaq:lulu') # Add ticker
root_condition = q.add_condition(
match='any',
)
c1 = q.add_condition(
match='all',
condition=root_condition
)
q.add_filter(
column='title',
type='...',
value='sales',
condition=c1
)
q.add_filter(
column='as_of_date',
type='>=',
value=["2020-01-01"],
condition=c1
)
q.add_filter(
column='as_of_date',
type='<=',
value=["2020-12-31"],
condition=c1
)
c2 = q.add_condition(
match='all',
condition=root_condition
)
q.add_filter(
column='title',
type='...',
value='marketing',
condition=c2
)
q.add_filter(
column='as_of_date',
type='>=',
value=["2021-01-01"],
condition=c2
)
q.add_filter(
column='as_of_date',
type='<=',
value=["2021-12-31"],
condition=c2
)
q.get_data(dataset_id='job_listings') # Retrieve data
Please note that the maximum nesting depth of conditions is two.
You can also specify start and limit; the default values are 1 and 100000. If you want to raise an exception when data is truncated, set except_on_truncate to True (it is False by default).
q.get_data(
dataset_id='job_listings',
start=1,
limit=1000,
except_on_truncate=True
)
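When a result set is larger than one page, start and limit can be used to page through it. The sketch below is a convenience, not part of the thinknum package: it takes the page-fetching call as a function so it is not tied to a live client, and it assumes each page is returned as a list of rows (verify the actual response shape of get_data in your environment):

```python
def fetch_all_rows(fetch_page, page_size=1000):
    """Collect every row by paging with start/limit.

    `fetch_page(start, limit)` should return one page of rows, e.g.
    lambda start, limit: q.get_data(
        dataset_id='job_listings', start=start, limit=limit).
    Paging stops once a page comes back shorter than `page_size`.
    """
    rows = []
    start = 1  # the API's start parameter is 1-based, per the default above
    while True:
        page = fetch_page(start, page_size)
        rows.extend(page)
        if len(page) < page_size:
            return rows
        start += page_size
```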
Sometimes you only need aggregated results for a dataset. In such cases you can retrieve them through the add_group and add_aggregation functions.
q.add_ticker('nasdaq:lulu') # Add ticker
q.add_group(column='as_of_date') # Add group
q.add_aggregation(
column='dataset__entity__entity_ticker__ticker__ticker',
type='count'
) # Add aggregation
q.add_sort(
column='as_of_date',
order='asc'
) # Add sort
q.get_data(dataset_id='job_listings')
There are a few functions that you can apply to queries to gather even more insight into the data. You can retrieve a listing of the functions available for a dataset with the get_dataset_metadata function. For example, there is a nearby function for the store dataset.
q.add_ticker('nasdaq:lulu')
q.add_function(
function='nearby',
parameters={
"dataset_type": "dataset",
"dataset": "store",
"tickers":["nyse:ua"],
"entities": [],
"distance": 5,
"is_include_closed": False
}
)
q.get_data(dataset_id='store')
You can also apply the nearest function to the store dataset:
q.add_ticker('nasdaq:lulu')
q.add_function(
function='nearest',
parameters={
"dataset_type": "dataset",
"dataset": "store",
"tickers":["nyse:ua"],
"entities": [],
"ranks": [1],
"is_include_closed": False
}
)
q.get_data(dataset_id='store')
You can also apply the sales function to the Car Inventory dataset:
q.add_ticker('nyse:kmx')
q.add_function(
function='sales',
parameters={
"lookahead_day_count": 2,
"start_date": "2020-01-01",
"end_date": "2020-01-07"
}
)
q.get_data(dataset_id='car_inventory')
You can reset the entire query, or reset its individual parts:
q.reset_query() # Reset entire query
q.reset_tickers() # Reset tickers
q.reset_filters() # Reset filters
q.reset_functions() # Reset functions
q.reset_groups() # Reset groups
q.reset_aggregations() # Reset aggregations
q.reset_sorts() # Reset sorts
History
Import the library:
from thinknum import History
As with Query, you must authenticate to use the History library.
h = History(
client_id='Your client id',
client_secret='Your client secret'
)
If you need to use a proxy, you can configure it with the proxies argument.
proxies = {
"http": "http://10.10.1.10:3128",
"https": "http://10.10.1.10:1080",
}
h = History(
client_id='Your client id',
client_secret='Your client secret',
proxies=proxies
)
Requests will skip SSL certificate verification if you set verify to False. By default, verify is set to True.
h = History(
client_id='Your client id',
client_secret='Your client secret',
verify=False
)
To retrieve a list of available history for a dataset:
h.get_history_list(dataset_id='store')
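For example, to pick the most recent available date from that list — assuming, as a sketch, that you have already pulled the history dates out of the get_history_list response as 'YYYY-MM-DD' strings (the exact response shape may differ in your environment):

```python
def latest_history_date(history_dates):
    """Return the most recent 'YYYY-MM-DD' date string, or None if
    the list is empty. ISO-formatted dates sort correctly as strings."""
    return max(history_dates) if history_dates else None
```

The returned date can then be passed as history_date to get_history_metadata or download below.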
You can view the metadata for the historical file:
h.get_history_metadata(
dataset_id='store',
history_date='2020-03-09'
)
To download a CSV of the historical data:
h.download(
dataset_id='store',
history_date='2020-03-09'
)
You can specify a download path:
h.download(
dataset_id='store',
history_date='2020-03-09',
download_path='/Users/sangwonseo/Downloads'
)
You can retrieve the download URL of the historical data:
h.get_url(
dataset_id='store',
history_date='2020-03-09'
)
License
MIT