Get the latest data for each country
The traffic data is now filtered according to the business rules, but it still contains the data for all the available years per country. You are only interested in the latest traffic data for each country.
You know a simple way to get what you want. First, you group the data using the three-letter country code. Since the data is already sorted by year in descending order, the first row of each group holds the latest available traffic data for that country. Nifty!
from robocorp.tasks import task
from RPA.HTTP import HTTP
from RPA.JSON import JSON
from RPA.Tables import Tables

http = HTTP()
json = JSON()
table = Tables()

TRAFFIC_JSON_FILE_PATH = "output/traffic.json"


@task
def produce_traffic_data():
    """
    Inhuman Insurance, Inc. Artificial Intelligence System automation.
    Produces traffic data work items.
    """
    http.download(
        url="https://github.com/robocorp/inhuman-insurance-inc/raw/main/RS_198.json",
        target_file=TRAFFIC_JSON_FILE_PATH,
        overwrite=True,
    )
    traffic_data = load_traffic_data_as_table()
    filtered_data = filter_and_sort_traffic_data(traffic_data)
    filtered_data = get_latest_data_by_country(filtered_data)


@task
def consume_traffic_data():
    """
    Inhuman Insurance, Inc. Artificial Intelligence System robot.
    Consumes traffic data work items.
    """
    print("consume")


def load_traffic_data_as_table():
    json_data = json.load_json_from_file(TRAFFIC_JSON_FILE_PATH)
    return table.create_table(json_data["value"])


def filter_and_sort_traffic_data(data):
    rate_key = "NumericValue"
    max_rate = 5.0
    gender_key = "Dim1"
    both_genders = "BTSX"
    year_key = "TimeDim"
    table.filter_table_by_column(data, rate_key, "<", max_rate)
    table.filter_table_by_column(data, gender_key, "==", both_genders)
    # Sort by year in descending order so that the newest rows come first.
    table.sort_table_by_column(data, year_key, False)
    return data


def get_latest_data_by_country(data):
    country_key = "SpatialDim"
    # Group the rows by the three-letter country code.
    data = table.group_table_by_column(data, country_key)
    latest_data_by_country = []
    for group in data:
        # The rows are sorted newest first, so the first row is the latest year.
        first_row = table.pop_table_row(group)
        latest_data_by_country.append(first_row)
    return latest_data_by_country
- The get_latest_data_by_country() function communicates the intent of the code by using clear naming.
- The group_table_by_column() function handles the grouping by country.
- The pop_table_row() function gets the first row of a table.
- This RPA.Tables library feels quite helpful indeed!
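To see why taking the first row of each group depends on the earlier sort, here is a minimal sketch of the same idea in plain Python, without RPA.Tables. The sample rows and the latest_rows_by_country() helper are made up for illustration; only the key names (SpatialDim, TimeDim, NumericValue) come from the code above.

```python
# Hypothetical sample rows; the keys mirror the ones used in the robot code.
rows = [
    {"SpatialDim": "FIN", "TimeDim": 2018, "NumericValue": 4.3},
    {"SpatialDim": "SWE", "TimeDim": 2017, "NumericValue": 3.0},
    {"SpatialDim": "FIN", "TimeDim": 2019, "NumericValue": 3.9},
    {"SpatialDim": "SWE", "TimeDim": 2019, "NumericValue": 2.8},
]


def latest_rows_by_country(sorted_rows, country_key="SpatialDim"):
    """Keep the first row seen for each country (illustrative helper)."""
    latest = {}
    for row in sorted_rows:
        # setdefault() stores a row only if the country is not there yet,
        # so later (older) rows for the same country are ignored.
        latest.setdefault(row[country_key], row)
    return list(latest.values())


# Sort newest first, like filter_and_sort_traffic_data() does.
rows.sort(key=lambda row: row["TimeDim"], reverse=True)
latest = latest_rows_by_country(rows)
```

If the rows were not sorted by year first, the first row of each group would be whichever row happened to appear first in the data, not necessarily the latest one.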