People API

Analytics Query API

The API returns analytics for the requested variables and query in JSON format.

The API supports more than 30 variables, including gender, age, location of permanent residence (coordinates, postal code), area-based bric data, statistical predictions (e.g. on income and education), and various classifications.

  • You can use all variables as filters. For instance, you can query how many 20-30 year old persons live in Helsinki.
  • You can get a statistical profile for the (filtered) result set. E.g. you can find out what percentage of the persons in the query result set are men, or what percentage of them are in the highest income decile.
API endpoints
  • https://api-test.bisnode.fi/people/analytics/v2/query - test environment

To use the API you must have an API key that is valid in the test environment, and it must be sent in a custom request header named x-api-key. If the API key is valid, the request succeeds (HTTP status code 200); otherwise it fails (HTTP status code 401). All API requests must be made over HTTPS.

cURL request example

curl 'https://api-test.bisnode.fi/people/analytics/v2/query' -X POST -d '{ "variables": "all", "filters": [] }' -H 'x-api-key: 1234567890ABC' -H 'Content-Type: application/json'
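
The same request can also be prepared from a script. The following is a minimal Python sketch using only the standard library. The helper name build_request is an assumption made for illustration, and the API key is the placeholder from the cURL example, not a working key.

```python
import json
import urllib.request

API_URL = "https://api-test.bisnode.fi/people/analytics/v2/query"


def build_request(body, api_key):
    """Build a POST request carrying the JSON body and the x-api-key header."""
    data = json.dumps(body).encode("utf-8")
    request = urllib.request.Request(API_URL, data=data, method="POST")
    request.add_header("x-api-key", api_key)
    request.add_header("Content-Type", "application/json")
    return request


# Placeholder key copied from the cURL example; replace it with your own.
request = build_request({"variables": "all", "filters": []}, "1234567890ABC")

# Sending is left commented out so that the sketch runs without network access:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Note that urllib.request.urlopen raises urllib.error.HTTPError for a 401 response, so an invalid key surfaces as an exception rather than as a return value.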

Getting started

The simplest query is the following:

POST https://api-test.bisnode.fi/people/analytics/v2/query

You don't need to send a body. You get the following response:

{
    "took": 297,
    "count": 3755716,
    "householdCount": 2283739,
    "analytics": {}
}

The following query returns the income profile for 18-30 year old persons:

POST https://api-test.bisnode.fi/people/analytics/v2/query
{
  "variables": ["pred_person_income"],
  "filters": [
    {
        "variable": "age",
        "min": 18,
        "max": 30
    }
  ]
}

Response:

{
    "took": 156,
    "count": 628228,
    "householdCount": 513535,
    "analytics": {
        "pred_person_income": {
            "10": 0.1386585899635462,
            "01": 37.514496851427644,
            "02": 16.152288026479315,
            "03": 16.16586634462091,
            "04": 11.43134642602692,
            "05": 7.0302141540388305,
            "06": 4.557043313237422,
            "07": 3.5354745542318025,
            "08": 2.355598580187157,
            "09": 1.119013159786453
        }
    }
}

From the response you can see that 37.5 percent of 18-30 year old persons are in the bottom income decile, and only 0.1% are in the topmost decile. Young persons are relatively poor, which is not a surprise.
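
The decile keys in the response are zero-padded strings and are not returned in order ("10" comes first above). As a small sketch of how the profile could be read, the following Python snippet sorts the values copied from the response above by decile number. The function name sorted_deciles is an assumption for illustration.

```python
# Profile copied from the example response above.
profile = {
    "10": 0.1386585899635462,
    "01": 37.514496851427644,
    "02": 16.152288026479315,
    "03": 16.16586634462091,
    "04": 11.43134642602692,
    "05": 7.0302141540388305,
    "06": 4.557043313237422,
    "07": 3.5354745542318025,
    "08": 2.355598580187157,
    "09": 1.119013159786453,
}


def sorted_deciles(profile):
    """Return (decile, percent) pairs ordered from decile 1 to decile 10."""
    return sorted((int(key), percent) for key, percent in profile.items())


deciles = sorted_deciles(profile)
for decile, percent in deciles:
    print(f"decile {decile:2d}: {percent:5.1f} %")
```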

Optional query string parameters

  • pretty (switch): Returns formatted JSON as text/plain. Otherwise the result is unformatted and the content type is application/json.

How to build query

The format of the request body is the following:

{
  variables: <"all", "count", [<list of variables>]>,
  filters: [ <list of filters> ],
  exclude: [ <list of filters> ]
}

  • variables: Specify the variables for which you want to retrieve a statistical profile. The following values are supported:
      • "count" (default): Retrieve only the number of persons and households that the query matches.
      • "all": Retrieve data for all available variables.
      • [<list of variable names>] (e.g. ["age"]): Retrieve analytics for the specified variables.

Notice: For performance reasons you should request only the data you need. If you query all data, the request will take more than 2 seconds. If you request only "count", the request should take approximately one hundred milliseconds.

  • filters: List of filters that must be true for all persons included in the result. The filter representation varies by the type of the variable; see the listing of variables below.
      • You can create a combined filter by using the unionGroup attribute: a person is included in the result if any of the filters having the same unionGroup is true. See example 3.
  • exclude: List of filters that must be false for all persons included in the result.
      • You can also use combined filters as exclude filters.
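
As a sketch of the body format described above, the following Python helper assembles a request body as a dictionary that can be serialized with json.dumps. The function name build_query and its validation rules are assumptions made for illustration; the API itself may accept or reject input differently.

```python
import json


def build_query(variables="count", filters=None, exclude=None):
    """Assemble a request body from variables, filters and exclude filters."""
    if isinstance(variables, str):
        # Only the two keywords named in the documentation are accepted here.
        if variables not in ("all", "count"):
            raise ValueError('variables must be "all", "count" or a list of names')
    else:
        variables = list(variables)
    body = {"variables": variables, "filters": list(filters or [])}
    if exclude:
        body["exclude"] = list(exclude)
    return body


query = build_query(
    variables=["pred_person_income"],
    filters=[{"variable": "age", "min": 18, "max": 30}],
)
print(json.dumps(query, indent=2))
```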

Filter types

Each variable has a filter type. Variables and their filter types are listed below, and you can also retrieve them from the schema API endpoint.

Common attributes for all filters:
  • variable is the variable identifier. Mandatory attribute.
  • unionGroup: Create combined filters by using union groups. Mathematically speaking: a combination of filters yields the intersection of their result sets by default. The result sets of filters having the same unionGroup are first merged (union) before the intersection with the other filters is calculated. See example 3.
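
The union-then-intersection rule can be illustrated locally. The sketch below is not the API's implementation; it only models the described semantics on a handful of hypothetical persons, whose field names (municipality, distance_m, age) are made up for this illustration.

```python
from collections import defaultdict


def person_matches(person, filters):
    """Evaluate filters given as (predicate, union_group) pairs.

    Filters sharing a union group are OR-ed together; the groups and the
    remaining filters (union_group is None) are then AND-ed.
    """
    groups = defaultdict(list)
    for index, (predicate, union_group) in enumerate(filters):
        # A filter without a union group forms a group of its own.
        key = ("single", index) if union_group is None else ("group", union_group)
        groups[key].append(predicate)
    return all(any(p(person) for p in members) for members in groups.values())


# Hypothetical persons used only to demonstrate the rule.
persons = [
    {"name": "A", "municipality": 91, "distance_m": 25000, "age": 25},
    {"name": "B", "municipality": 49, "distance_m": 8000, "age": 29},
    {"name": "C", "municipality": 49, "distance_m": 30000, "age": 22},
    {"name": "D", "municipality": 91, "distance_m": 2000, "age": 45},
]

filters = [
    (lambda p: p["municipality"] == 91, "location"),
    (lambda p: p["distance_m"] <= 10000, "location"),
    (lambda p: 18 <= p["age"] <= 30, None),
]

selected = [p["name"] for p in persons if person_matches(p, filters)]
print(selected)  # ['A', 'B']
```

Persons A and B satisfy at least one filter of the "location" group and the age filter. C fails the whole group, and D fails the age filter.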
Filter type: scalar-range

A scalar range filter specifies a value range for a variable. The start (min) and end (max) points are included in the range.

A scalar range filter must have the following attributes (in addition to the mandatory common attributes):

  • min is the minimum value for the variable. Mandatory for range filters. Ignored for others.
  • max is the maximum value for the variable. Mandatory for range filters. Ignored for others.

Example:

{
    "variable": "age",
    "min": 18,
    "max": 30
}
Filter type: has-any

A person must have at least one of the values listed in the filter object.

A has-any filter must have the following attribute (in addition to the mandatory common attributes):

  • value is an array of valid values for the variable.

Example:

{
   "variable": "bric_residential_area",
   "value": [1, 2]
}
Filter type: has-any-or-null

If a person has a value for the variable, it must be one of the values listed in the filter object. The filter returns true if the value is missing.

A has-any-or-null filter must have the following attribute (in addition to the mandatory common attributes):

  • value is an array of valid values for the variable.

Example:

{
   "variable": "robinson",
   "value": ["tele"]
}
Filter type: geo-range

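A geo range filter specifies a WGS84 coordinate and a range (in meters). A person must be within the range from the given coordinate. The range end point is included in the result set.

A geo range filter must have the following attributes (in addition to the mandatory common attributes):

  • x is the longitude of the coordinate.
  • y is the latitude of the coordinate.
  • range is the maximum distance (in meters) from the specified coordinate.

Notice: The filter examples in this document pass the latitude (about 60) as x and the longitude (about 25) as y, and most of them name the distance attribute radius instead of range.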

Example:

{
    "variable": "location",
    "x": 60.1926622,
    "y": 24.9442899,
    "radius": 10000
}
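
The matching rules of the four filter types can be summarized as small predicates. This is only a local model of the semantics described above, not code from the API. It assumes that a person has at most one value per variable, represented as None when missing, and that the distance to the coordinate has already been computed.

```python
def scalar_range_matches(value, minimum, maximum):
    """scalar-range: both end points belong to the range."""
    return minimum <= value <= maximum


def has_any_matches(value, allowed):
    """has-any: the person's value must be one of the listed values."""
    return value is not None and value in allowed


def has_any_or_null_matches(value, allowed):
    """has-any-or-null: like has-any, but a missing value also matches."""
    return value is None or value in allowed


def geo_range_matches(distance_m, range_m):
    """geo-range: the end point of the range is included."""
    return distance_m <= range_m


print(scalar_range_matches(30, 18, 30))          # True
print(has_any_matches(None, [1, 2]))             # False
print(has_any_or_null_matches(None, ["tele"]))   # True
print(geo_range_matches(10000, 10000))           # True
```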

Response

The format of the response is the following:

{
    "took": <number>,
    "count": <number>,
    "householdCount": <number>,
    "error": <string>,
    "analytics": {
        <variable name>: <profile>
    }
}

  • took: tells how long the data query took. This is used for monitoring and testing of the API endpoint.
  • count: the number of persons the query matches.
  • householdCount: the number of households the query matches. A household is included if at least one person matching the query lives in it. The household count is accurate if count is under 40 000. If count is more than 40 000, householdCount is an estimate. The error should be less than 2%.
  • error: if there are fewer than 10 hits, analytics is not available. Instead of data, this attribute contains an error message.
  • analytics: contains the analytics profile for each variable you have requested. The meaning of the data depends on the analytics type of the variable.
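
A client will typically check the error attribute before reading analytics. The following Python sketch shows one way to do that on an already parsed response. The function names are assumptions for illustration, the error message in the usage example is invented (the real wording is not documented here), and the behaviour at exactly 40 000 hits is not specified, so the sketch treats only larger counts as estimates.

```python
def analytics_or_error(response):
    """Return the analytics profiles, or raise if the API reported an error."""
    if "error" in response:
        raise ValueError(response["error"])
    return response.get("analytics", {})


def household_count_is_estimate(response):
    """householdCount is documented as an estimate when count exceeds 40 000."""
    return response["count"] > 40000


# Response copied from the simplest query in "Getting started".
ok = {"took": 297, "count": 3755716, "householdCount": 2283739, "analytics": {}}
print(analytics_or_error(ok))            # {}
print(household_count_is_estimate(ok))   # True

# Hypothetical error response; the message text is invented.
too_few = {"took": 5, "count": 3, "householdCount": 2, "error": "too few hits"}
try:
    analytics_or_error(too_few)
except ValueError as exc:
    print("analytics not available:", exc)
```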

Analytics types

Each variable has an analytics type. See the analytics type of each variable in the table below, or get it from the schema API endpoint.

There are four different analytics types:

  • classification: Each person belongs to one class, or does not have a class. E.g. age, gender. In some cases a person may belong to several classes, and thus the sum of the classes is not necessarily 100%. Relative portions are calculated from the persons having a classification. Missing values are simply ignored.
  • decile: The population is ordered by a variable and split into 10 groups. Most predictions (prefix: pred_) have this type of analytics profile. While in theory each decile over all data should be exactly 10%, in practice this is not the case. There are two reasons for this: 1) Many persons in the data set have the same value for a variable, and persons having the same value are classified into the same decile. 2) Deciles are always calculated from the largest available dataset. The database contains 3.7 million consumers out of a total of ~4.5 million consumers. If possible and reasonable, the deciles for a variable are calculated from the 4.5 million consumer dataset.
  • top N: The profile contains the top N classes and their relative portions of all persons who have the variable. The sum of the portions is often less than 100%. There are three variables that have this type of analytics profile: postal_code, municipality_code and county_code.
  • none: An analytics profile is not available. There are two variables that do not have any analytics: location (coordinates) and robinson.

All returned data points are percentages calculated from the persons in the result set that have the requested variable. Not all persons in the database have data for all variables. If a person in the database doesn't have a value, the person is simply ignored in the analytics profile for that variable.

This is good to keep in mind especially when you use the area-based bric* variables. Bric cannot be calculated for a person who lives in a too sparsely populated area. If you want to exclude the persons who do not have a value, use a filter containing all possible values for the variable.

E.g. the following query returns the bric_housing_type profile for all persons that have it in the database:

{
    "variables": ["bric_housing_type"],
    "filters": [
        {
            "variable": "bric_housing_type",
            "value": [1,2]
        }
    ]
}

Response is:

{
    "took": 312,
    "count": 3293930,
    "householdCount": 2009332,
    "analytics": {
        "bric_housing_type": {
            "1": 59.24284991382449,
            "2": 40.75715008617551
        }
    }
}

As you can see, there are about 3.3 million persons who live in an area populated densely enough for this type of statistical metric. Without the filter you will get 3.7 million hits, and the same analytics profile.
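
Because the filter above guarantees that every person in the result set has a value for bric_housing_type, the percentages can be turned back into approximate head counts by multiplying with count. The sketch below does this for the response shown above. The function name approximate_counts is an assumption, and the conversion is only meaningful when the result set has been filtered this way.

```python
# Values copied from the response above.
count = 3293930
profile = {"1": 59.24284991382449, "2": 40.75715008617551}


def approximate_counts(count, profile):
    """Convert a percentage profile into rounded absolute numbers of persons."""
    return {cls: round(count * percent / 100) for cls, percent in profile.items()}


counts = approximate_counts(count, profile)
print(counts)
```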

 

Example 1 - Count

Problem:

A pizzeria entrepreneur wants to know how many potential customers live within 1 km of his restaurant located at GPS location (60.157299, 24.8407841).

Solution:

The following query fulfills the requirements:

{
    "variables": "count",
    "filters": [
        {
         "variable": "location",
         "x": 60.157299,
         "y": 24.8407841,
         "radius": 1000
        }
    ]
}

If you don't need analytics data, set variables to "count" or don't specify variables at all. In this case it is enough to use just one filter (location).

Response is currently (18.10.2016):

{"took":140,"count":113,"householdCount":92,"analytics":{}}

The count 113 is the number of people who live within 1 km of the given location. E.g. workplaces are not included.

Example 2 - Exclude

Problem:

A pizzeria entrepreneur wants to know how many potential customers live 1-2 km from his restaurant located at GPS location (60.157299, 24.8407841). He wants to advertise the restaurant to them. People who live less than 1 km away seem to know the restaurant already, and he doesn't want to waste his marketing budget on them. He wants to know how many flyers he must print if he wants to deliver one to every household.

Solution:

The following query fulfills the requirements:

{
    "variables": "count",
    "filters": [
        {
         "variable": "location",
         "x": 60.157299,
         "y": 24.8407841,
         "radius": 2000
        }
    ],
    "exclude": [
        {
         "variable": "location",
         "x": 60.157299,
         "y": 24.8407841,
         "radius": 1000
        }
    ]
}

Currently there is no filter where you could set a distance range. However, you can query all people within 2 km and then remove the people within 1 km by using the exclude attribute.
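
The include/exclude pattern can be wrapped in a small helper. The following Python sketch builds such a query body. The function name ring_query is an assumption for illustration, and the attribute names (x, y, radius) are copied from the query above. Since the end point of a geo range is included, the resulting ring covers distances greater than the inner radius and up to and including the outer radius.

```python
def ring_query(x, y, inner_m, outer_m, variables="count"):
    """Build a query for persons farther than inner_m but within outer_m."""
    if not 0 <= inner_m < outer_m:
        raise ValueError("inner_m must be non-negative and smaller than outer_m")

    def circle(radius):
        # Geo filter in the same form as in the examples of this document.
        return {"variable": "location", "x": x, "y": y, "radius": radius}

    return {
        "variables": variables,
        "filters": [circle(outer_m)],
        "exclude": [circle(inner_m)],
    }


query = ring_query(60.157299, 24.8407841, inner_m=1000, outer_m=2000)
print(query)
```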

Response is currently (18.10.2016):

{"took":120,"count":5097,"householdCount":4331,"analytics":{}}

The household count, ~4300, is the number of households located 1-2 km from the given location. However, that's not actually the number the entrepreneur wants. There are quite a number of people who don't want direct marketing. You can filter them out by using the robinson filter:

{
    "variables": "count",
    "filters": [
        {
         "variable": "location",
         "x": 60.157299,
         "y": 24.8407841,
         "radius": 2000
        },
        {
            "variable": "robinson",
            "value": ["tele"]
        }
    ],
    "exclude": [
        {
          "variable": "location",
          "x": 60.157299,
          "y": 24.8407841,
          "radius": 1000
        }
    ]
}

The semantics of the robinson filter are slightly tricky. The query above keeps the people who have only the tele robinson or no robinson at all. Thus, using it in exclude would produce a wrong result.

Response:

{"took":141,"count":4833,"householdCount":4127,"analytics":{}}

4127 households in the given area do not have a direct marketing prohibition in the Bisnode database.

Example 3 - Union Groups and combined filters

Problem:

You want to estimate the business potential of an expensive service targeted at young persons. Your shop is located in Lauttasaari. Thus, you want to see the income prediction for all young persons (18-30 years old) who live in Helsinki or within 10 km of the location of your shop.

Solution:

The following query fulfills the requirements:

{
  "variables": ["pred_person_income"],
  "filters": [
    {
     "variable": "municipality_code", 
     "value": [91],
     "unionGroup": "location"
    },
    {
     "variable": "location",
     "x": 60.157299,
     "y": 24.8407841,
     "radius": 10000,
     "unionGroup": "location"
    },
    {
     "variable": "age", 
     "min": 18,
     "max": 30
    }
  ]
}

This query calculates the pred_person_income profile with the following conditions:

  • (filter 1) the person must live in Helsinki (the municipality code of Helsinki is 91; you can find it from the schema API endpoint) OR (filter 2) his/her home must be within 10 km of GPS location (60.157299, 24.8407841). The unionGroup attribute is used to first calculate the union of all persons for whom either condition 1 or condition 2 is true.
  • (filter 3) the person must be 18-30 years old.

The query currently (18.10.) returns:

{
    "took":125,
    "count":109765,
    "householdCount":92804,
    "analytics":{
        "pred_person_income":{
            "10":0.32618825722274,
            "01":27.392504065932055,
            "02":13.284176671600607,
            "03":14.352283317800559,
            "04":14.142134020430166,
            "05":11.728158178462438,
            "06":7.706083365312574,
            "07":5.536063446813954,
            "08":3.4217787767483507,
            "09":2.1106298996765527
        }
    }
}

It's not surprising that young people's incomes don't fall into the highest deciles. In the requested area there are approximately 110 000 consumers matching the query. Most of them probably don't belong to the target group.

Notice: Get the proper postal codes, municipality codes and county codes from the schema endpoint.

Variables

Variable name (display name, en) | Filter type / Analytics type | Description / Filter example
age (Age) | scalar range / classification | Usage example: {"min":18,"max":81,"variable":"age"}
bric_education_level (Bric - Education level) | has any / classification | Estimated education level (persons aged 18 or over) in the residential area, shown in classes 1-3. Usage example: {"value":[1,3],"variable":"bric_education_level"}
bric_home_ownership (Bric - House ownership) | has any / classification | Estimated house ownership in the residential area, shown in classes 1-2. Usage example: {"value":[1,2],"variable":"bric_home_ownership"}
bric_housing_type (Bric - House type) | has any / classification | Estimated house type in the residential area, shown in classes 1-2. Usage example: {"value":[1,2],"variable":"bric_housing_type"}
bric_life_stage (Bric - Life stage) | has any / classification | Estimated life stage (persons aged 18 or over) in the residential area, shown in classes 1-4. Usage example: {"value":[1,4],"variable":"bric_life_stage"}
bric_payment_default_risk (Bric - Payment risk) | has any / classification | Estimated payment risk (persons aged 18 or over) in the residential area, shown in classes 1-4. Usage example: {"value":[1,4],"variable":"bric_payment_default_risk"}
bric_purchasing_power (Bric - Purchasing power) | has any / classification | Estimated purchasing power (persons aged 18 or over) in the residential area, shown in classes 1-4. Usage example: {"value":[1,4],"variable":"bric_purchasing_power"}
bric_residential_area (Bric - Residential area) | has any / classification | Estimated residential area, shown in classes 1-5. Usage example: {"value":[1,2],"variable":"bric_residential_area"}
counties (County) | / top N | Usage example: {"value":[2],"variable":"counties"}
gender (Gender) | has any / classification | Usage example: {"value":["F"],"variable":"gender"}
has_phone_numbers (Has phone number(s)) | has any / classification | Whether the person has at least one phone number in the Bisnode phone number data source. Usage example: {"value":"T","variable":"has_phone_numbers"}
household_size (Household size) | has any / classification | Number of adults (over 18) living in the same household. Usage example: {"value":[1,2],"variable":"household_size"}
language_preference (Preferred language) | has any / classification | Usage example: {"value":["fi","sv","other"],"variable":"language_preference"}
location (Residence location (GPS)) | geo range / none | Usage example: {"x":60.5,"y":24.5,"range":10000,"variable":"location"}
municipality_code (Municipality) | has any / top N | Usage example: {"value":[2],"variable":"municipality_code"}
postal_code (Postal code) | has any / top N | Usage example: {"value":["00100","00120"],"variable":"postal_code"}
pred_car_ownership (Car ownership) | scalar range / decile | Car ownership prediction, shown in classes 1-10. Class 1: no. Class 10: yes. Usage example: {"min":1,"max":10,"variable":"pred_car_ownership"}
pred_direct_marketing_preference (Direct marketing preference) | scalar range / decile | Estimate of the person's attitude toward direct marketing. Decile 1: negative. Decile 10: positive. Usage example: {"min":1,"max":10,"variable":"pred_direct_marketing_preference"}
pred_family_with_children (Households with children under 18 years) | scalar range / decile | Prediction of whether minors under 18 years old live in the household, shown in classes 1-10. Class 1: no. Class 10: yes. Usage example: {"min":1,"max":10,"variable":"pred_family_with_children"}
pred_family_with_children_10_to_17 (Households with children over 10 years) | scalar range / decile | Prediction of whether minors over 10 years old live in the household, shown in classes 1-10. Class 1: no. Class 10: yes. Usage example: {"min":1,"max":10,"variable":"pred_family_with_children_10_to_17"}
pred_family_with_children_under_10 (Households with children under 10 years) | scalar range / decile | Prediction of whether minors under 10 years old live in the household, shown in classes 1-10. Class 1: no. Class 10: yes. Usage example: {"min":1,"max":10,"variable":"pred_family_with_children_under_10"}
pred_household_debt (Household's debts) | scalar range / decile | Estimated debts (household), shown in classes 1-10. Class 1: less than 4 025 €. Class 10: over 93 169 €. Usage example: {"min":1,"max":10,"variable":"pred_household_debt"}
pred_household_education (Household's education level) | scalar range / decile | Estimated education level (household), shown in classes 1-10. Class 1: low educated. Class 10: highly educated. Usage example: {"min":1,"max":10,"variable":"pred_household_education"}
pred_household_income (Household's ordinary income) | scalar range / decile | Estimated ordinary income (household), shown in classes 1-10. Class 1: less than 18 061 €/year. Class 10: over 78 286 €/year. Usage example: {"min":1,"max":10,"variable":"pred_household_income"}
pred_household_income_from_capital (Household's capital gains) | scalar range / decile | Estimated capital gains (household), shown in classes 1-10. Class 1: less than 271 €/year. Class 10: over 4 296 €/year. Usage example: {"min":1,"max":10,"variable":"pred_household_income_from_capital"}
pred_person_debt (Person's debts) | scalar range / decile | Estimated debts (personal), shown in classes 1-10. Class 1: less than 3 332 €. Class 10: over 54 729 €. Usage example: {"min":1,"max":10,"variable":"pred_person_debt"}
pred_person_education (Person's education level) | scalar range / decile | Estimated education level (personal), shown in classes 1-10. Class 1: low educated. Class 10: highly educated. Usage example: {"min":1,"max":10,"variable":"pred_person_education"}
pred_person_income (Person's ordinary income) | scalar range / decile | Estimated ordinary income (personal), shown in classes 1-10. Class 1: less than 15 287 €/year. Class 10: over 42 263 €/year. Usage example: {"min":1,"max":10,"variable":"pred_person_income"}
pred_person_income_from_capital (Person's capital gains) | scalar range / decile | Estimated capital gains (personal), shown in classes 1-10. Class 1: less than 144 €/year. Class 10: over 3 095 €/year. Usage example: {"min":1,"max":10,"variable":"pred_person_income_from_capital"}
pred_residency (Person's ownership of home) | scalar range / decile | Estimated probability of the person's form of housing, shown in classes 1-10. Class 1: lives in owner-occupied housing. Class 10: lives in rented housing. Usage example: {"min":1,"max":10,"variable":"pred_residency"}
pred_tele_marketing_preference (Telemarketing preference) | scalar range / decile | Estimate of the person's attitude toward telemarketing. Decile 1: negative. Decile 10: positive. Usage example: {"min":1,"max":10,"variable":"pred_tele_marketing_preference"}
robinson (Robinson marketing prohibition) | has any or null / none | Usage example: {"value":["tele","post"],"variable":"robinson"}
suomi_360 (Suomi 360 -profile) | has any / classification | The Suomi 360 life stage profile classifies the whole population into eight groups. Usage example: {"value":[1,3,5,8],"variable":"suomi_360"}
valuegraphics_9_classes () | has any / classification | Socio-cultural model that describes values beyond behavior. Usage example: {"value":[1,3,5,8],"variable":"valuegraphics_9_classes"}