2 years ago by IPinfo Team 7 min read

IP Data Enrichment: 4 Ways To Add Insights To Your Logs

One of the most common operations to gain insights from IP addresses logged in a system or server log is to enrich them with IPinfo’s data. You can enrich your IP log with IPinfo’s API or database services that include IP to geolocation, IP to privacy detection, IP to company information, IP to mobile carrier information, and more.

Although this process may look straightforward at first glance, it takes some care: your IP log may be huge, and our databases are large as well. In this article, we look at efficient ways to enrich your IP log data with our IP databases and API service. So, let’s get started.

IPinfo’s essential open-source tools

Before we get started, we need to install some essential open-source tools created by IPinfo’s expert data team. The installation of these tools is pretty straightforward and our documentation provides you with a clear path to get up and running.

| Tool | Description |
| --- | --- |
| IPinfo CLI | Official command-line interface of IPinfo. Enables you to interact with the API and provides some essential features for working with IP information and databases. |
| IPinfo MMDBctl | IPinfo’s MMDB management and query tool. |

Both of these tools are free to use and are open source. Feel free to explore them to learn about their features and options.

Server log or IP address database

Ideally, you would start with your own web server log containing IP addresses. If you don’t have access to one right now, a few IPinfo CLI tricks can help you create a test dataset.

Generating random IP addresses

You can use the randip command on the IPinfo CLI to generate random IP addresses. The command supports a variety of options such as generating only IPv4 addresses, only IPv6 addresses, excluding reserved IP addresses, specifying IP address ranges, etc.

🔗 Documentation on the randip command

To keep it simple, you can start by generating random IPv4 addresses:

ipinfo randip -n 50 > ips.txt

We are going to save these random IPv4 addresses to the ips.txt file.

Generating random IP addresses with randip command

The randip command can also generate random IPv6 addresses with the -6 option, but there is a caveat. Compared with IPv4, the vast majority of the IPv6 address space is not assigned to any device, so if you generate random IPv6 addresses there is a high probability that you will not find any information about them.
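If you would rather build the test file from a script, the idea behind randip can be approximated with Python’s standard library. This is only a rough sketch, not how the CLI itself works: the function name random_public_ipv4 is hypothetical, and the standard library’s is_global check is an assumption standing in for the CLI’s own handling of reserved ranges.

```python
import ipaddress
import random


def random_public_ipv4(count, seed=None):
    """Return `count` random IPv4 addresses outside reserved ranges."""
    rng = random.Random(seed)
    result = []
    while len(result) < count:
        candidate = ipaddress.IPv4Address(rng.getrandbits(32))
        # is_global rejects private, loopback, link-local and similar
        # ranges; multicast is excluded with a separate check.
        if candidate.is_global and not candidate.is_multicast:
            result.append(str(candidate))
    return result


if __name__ == "__main__":
    # Prints 50 addresses, one per line (redirect the output to ips.txt).
    print("\n".join(random_public_ipv4(50)))
```

Passing a seed makes the generated list reproducible, which is handy when you want to rerun the same enrichment test.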

Extracting IP addresses from a text file

If you already have access to a server or traffic log, you can extract the IP addresses from it with the IPinfo CLI’s grepip command.

🔗 Documentation on the grepip command.

We started with a sample server log file found on the internet. Rather than attempting to parse and structure it, we can extract only the IP addresses and store them in the ips.txt file using the IPinfo CLI’s grepip command.

ipinfo grepip -o access_log.log > ips.txt
Extracting IP addresses from a text file using the grepip command

Like randip, grepip comes with a few options such as matching only IPv4 or IPv6 addresses, excluding reserved IPs, etc.
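For comparison, here is a minimal Python sketch of the same extraction idea, limited to IPv4. It is not the CLI’s implementation: extract_ipv4 and its skip_reserved parameter are hypothetical names, and the reserved-address filter relies on the standard library’s is_global, which may not match the CLI’s rules exactly. It also removes duplicates, which keeps the number of lookups down later.

```python
import ipaddress
import re

# Loose pattern for dotted-quad candidates; real validation happens below.
IPV4_CANDIDATE = re.compile(r"(?<![\d.])\d{1,3}(?:\.\d{1,3}){3}(?!\d|\.\d)")


def extract_ipv4(text, skip_reserved=True):
    """Return unique IPv4 addresses found in `text`, in first-seen order."""
    seen = {}
    for match in IPV4_CANDIDATE.findall(text):
        try:
            address = ipaddress.IPv4Address(match)
        except ipaddress.AddressValueError:
            continue  # looked like an IP but is not one, e.g. 999.1.1.1
        if skip_reserved and not address.is_global:
            continue
        seen.setdefault(str(address), None)
    return list(seen)


# Example usage with the file names from this article:
# with open("access_log.log") as log, open("ips.txt", "w") as out:
#     out.write("\n".join(extract_ipv4(log.read())))
```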

Now that we have an IP address dataset we can work with, let’s get started with enriching it with IPinfo data.

Method 1: Using the API service with our CLI

If you want to enrich many IP addresses at once, you can use the IPinfo CLI. To use the bulk lookup feature, first authorize the IPinfo CLI by running ipinfo login and providing your access token. The CLI uses your API access to run the bulk operation.

As we have prepared an ips.txt file that contains all the IP addresses, we can simply pass it to the IPinfo CLI and it will generate a file enriched with IP insights.

The CLI will bulk upload all the IP addresses to our batch operation API endpoint. From that, the CLI outputs the result in CSV or JSON format. The default output is JSON, and to output it as CSV you must add the -c option.

As CSV is the easiest way to ingest databases for further analysis, we are going to use the CSV option and output it to the ipinfo_ip.csv file.

cat ips.txt | ipinfo -c > ipinfo_ip.csv
Enriching IP log with the IPinfo API

The ipinfo_ip.csv file will contain the IP information we have to offer on the IP address dataset. The level of IP information we provide depends on your pricing tier.
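Once the CSV exists, a quick sanity check is to count rows per value of one column. The sketch below uses only the standard library. The helper name count_by_column is hypothetical, and the "country" column in the usage comment is an assumption: check the header row of your own file, since the available columns depend on your plan.

```python
import csv
from collections import Counter


def count_by_column(csv_file, column):
    """Count how often each non-empty value of `column` appears."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader if row.get(column))


# Example usage (the column name is an assumption):
# with open("ipinfo_ip.csv", newline="") as f:
#     print(count_by_column(f, "country").most_common(5))
```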

Method 2: Using our databases and mmdbctl

If you want to use our databases to enrich your IP address log, you should use the MMDB file format together with our mmdbctl utility.

Although you have access to the CSV and JSON database file formats, for simple and direct log enrichment in a batch process your best option is the MMDB database. Find out which file format suits your needs best from this article.

🔗 Documentation on the mmdbctl tool

The MMDB database format provides the fastest and most reliable way to enrich your IP data. The amount of information you will get by looking up the IP addresses depends on the database you are using. For this example, we are using the IP to Geolocation database, which provides IP location information such as city, region, zip code, geographic coordinates, etc.

With the mmdbctl tool, we are running the read command and declaring the output format to be CSV with -f csv, and saving the result to the ipinfo_ip.csv file.

mmdbctl read -f csv ips.txt location.mmdb > ipinfo_ip.csv
Enriching IP log with IPinfo's database

After running the above command, the ipinfo_ip.csv file will contain the IP addresses and their respective IP geolocation information. It’s important to note that the output file will not contain information on bogon IP addresses, since they are not included in the IP geolocation database.
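If the output has fewer rows than your input, bogons are a likely reason. The following sketch separates them ahead of time so you can see which addresses to expect in the result. It assumes that Python’s is_global property is a reasonable approximation of "not a bogon"; IPinfo’s own bogon list may differ, and split_bogons is a hypothetical helper name.

```python
import ipaddress


def split_bogons(ips):
    """Split addresses into (routable, bogon) lists using stdlib heuristics."""
    routable, bogons = [], []
    for ip in ips:
        address = ipaddress.ip_address(ip)
        # Private, loopback, link-local and similar ranges are not global.
        (routable if address.is_global else bogons).append(ip)
    return routable, bogons


# Example usage:
# with open("ips.txt") as f:
#     routable, bogons = split_bogons(f.read().split())
# print(f"{len(bogons)} addresses will have no geolocation row")
```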

Method 3: Using a programming language

The solutions we just learned are limited to the terminal. However, you might want to enrich your IP data within a programming language environment. In that instance, you have a few options as well.

API-based log enrichment

To use an API-based solution with a programming language, you have two options:

  1. You can use our batch API endpoint.
  2. Or, you can use one of our many official open-source libraries which support batch and bulk lookups.

The batch API endpoint accepts input in several forms, such as a JSON array or a newline-separated list of URLs. The process is covered on our documentation page.
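To illustrate the JSON array form, here is a hedged sketch that calls the endpoint directly with Python’s standard library. The URL https://ipinfo.io/batch and the token query parameter are assumptions based on the public API and should be confirmed against the documentation; build_batch_request and fetch_batch are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint URL; confirm it against the batch API documentation.
BATCH_URL = "https://ipinfo.io/batch"


def build_batch_request(ips, token):
    """Build a POST request whose body is a JSON array of IP addresses."""
    query = urllib.parse.urlencode({"token": token})
    body = json.dumps(list(ips)).encode("utf-8")
    return urllib.request.Request(
        f"{BATCH_URL}?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_batch(ips, token, timeout=10):
    """Send the request and decode the JSON response."""
    request = build_batch_request(ips, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping the request construction separate from the network call makes it easy to inspect or test the payload without spending API quota.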

Here is an example of doing log enrichment using the Python programming language and our official open-source Python library.

# our official Python library
import ipinfo

# contains 200 IP addresses
with open("./ips.txt") as file:
    ips = file.read().strip().split("\n")

# get your access token from here: <https://ipinfo.io/account/token>
access_token = "YOUR_TOKEN"

# <https://github.com/ipinfo/python#batch-operations>
handler = ipinfo.getHandler(access_token)

# `data` contains your enriched IP data in a dictionary format
data = handler.getBatchDetails(ips)
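If you want the same kind of CSV output as the CLI methods, the batch result can be flattened with the standard library. This is a sketch under an assumption: it treats data as a mapping from each IP address to a dictionary of fields, and any nested values are written as their string representation. batch_to_csv is a hypothetical helper, not part of the library.

```python
import csv
import io


def batch_to_csv(data):
    """Render a mapping of {ip: {field: value}} as CSV text."""
    # Collect every field name, keeping first-seen order.
    fields = ["ip"]
    for details in data.values():
        for key in details:
            if key not in fields:
                fields.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fields, restval="", lineterminator="\n"
    )
    writer.writeheader()
    for ip, details in data.items():
        # Fields missing for an IP are left empty via restval.
        writer.writerow({"ip": ip, **details})
    return buffer.getvalue()


# Example usage:
# with open("ipinfo_ip.csv", "w", newline="") as f:
#     f.write(batch_to_csv(data))
```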

The API-based solution provides a fast and efficient way to enrich log data. If you are using the official libraries, you also get additional fields. For a geolocation lookup, for example, these include the continent name, a European country check, currency, and many other attributes.

Database-based enrichment

Our bulk/batch API is fast, but you simply can’t beat the speed of a database lookup, particularly with an MMDB file.

The API-based log enrichment process usually takes a second or two. Looking up the same data in IPinfo’s MMDB database is a different story.

# mmdb reader library
import maxminddb

# contains 200 IP addresses
with open("./ips.txt") as file:
    ips = file.read().strip().split("\n")
    
ipinfo_reader = maxminddb.open_database("./location.mmdb")

data = [ipinfo_reader.get(ip) for ip in ips]

Looking up the geolocation information of 200 IP addresses from our database takes around one-tenth of a second, while the same operation takes about a second through our API. You can also join the IP geolocation database results with results from other IPinfo databases.
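Joining results from several databases can be sketched as a merge of per-IP records. The helper below is hypothetical and deliberately generic: it accepts any lookup callables, such as the get method of a maxminddb reader, and treats a None result (an address missing from that database) as an empty record. The second database file in the usage comment is a placeholder name, not a real product file.

```python
def enrich(ips, lookups):
    """Merge the records returned by several lookup callables per IP.

    Each callable takes an IP string and returns a dict, or None when
    the address is not present in that database.
    """
    enriched = {}
    for ip in ips:
        record = {}
        for lookup in lookups:
            # Later lookups overwrite earlier ones on duplicate keys.
            record.update(lookup(ip) or {})
        enriched[ip] = record
    return enriched


# Example usage (the second file name is a placeholder):
# geo = maxminddb.open_database("./location.mmdb")
# other = maxminddb.open_database("./another_database.mmdb")
# data = enrich(ips, [geo.get, other.get])
```

Because the merge only depends on callables, the same function works for a mix of sources, for example one MMDB reader and one in-memory dictionary.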

Method 4: Using the bulk upload page on your account dashboard

If one-time enrichment is your goal and you don’t have the time or the need to open your terminal or IDE, we’ve got you covered. You can run bulk/batch IP enrichment right from our website. The bulk upload tool is available in your account dashboard.

All you have to do is upload or drag and drop your list of IP addresses in the tool. Then as soon as we are done processing, the enriched data will download automatically.

IP data enrichment with bulk upload

And that concludes this article. One thing to note is that the level of data enrichment is limited by your API tier or database download access, so feel free to check out our API data types and database types to find what best fits your needs.

If you need further assistance in enriching your data with IPinfo’s insights, reach out to our data experts today.

About the author

IPinfo Team

Internet Data Expert