Important Dates

Data Description

The NTCIR-13 Lifelog data consists of at least 45 days of data from two active lifeloggers. The full phase-2 dataset contains the following data:


  • Images from a Narrative Clip 2 wearable camera, set to capture at 45-second intervals from breakfast to sleep. This yields about 1,500 images per day. The output of a concept detector accompanies the images to assist teams in building a search engine for the data.
  • Music listening history (see an example of music listening history here)

  • Biometrics 24x7 (heart rate, galvanic skin response, calorie burn, steps)
  • Blood Pressure daily, in the morning after preparing (but before eating) breakfast and before exercising
  • Blood Sugar levels every morning after waking up, before eating.

Human Activity
  • Semantic locations visited
  • Physical activities
  • Daily mood, according to Thayer's two-dimensional model of mood
  • Diet log (manual logging of photos of food).

Computer Usage (as document vectors on a per-minute basis)
  • Computer input via keyboard and information consumed on the computer via ASR of on-screen activity on a per-minute basis. This data is filtered using a blacklist, anonymised and then stemmed using an English language stemmer. Each minute is represented by a sorted document vector.

Baseline Search Engine

To assist participating groups, the organisers have developed a baseline search engine. It is accessible here and can be used to submit basic queries to the system. Queries can filter images by userID, location and/or physical activity. Visual concept search will be added on 30th June.

Registration and Data Release Forms

Every participating group must first register with NTCIR and indicate its intention to take part in the lifelog task. This can be done by following this link and registering for the Lifelog task at NTCIR-13.

Once this registration with NTCIR is completed, NTCIR-Lifelog participants are required to sign two forms to access the datasets: an organisational agreement form for the organisation (signed by the research team leader) and an individual agreement form for each member of the research team who will access the data. The organisation agreement form should be sent to the lifelog task organisers in PDF format. The individual agreement form must be signed by all researchers who will use the data and kept on file by the organisation. It should not be sent to the organisers unless requested at a later date.

  1. Organisation Agreement form: to be signed by the organisation to which the participants belong. This form must be signed and sent by email to the NTCIR-Lifelog organisers.
  2. Individual Agreement form: to be signed by each individual researcher wishing to use the NTCIR-Lifelog data collection. This form must be filed by the participating organisation, but it does not need to be sent to the lifelog organisers.

Upon completion of this process, the participants will be sent a unique username and password to access the dataset. Please see the section below.

Access to the LifeLog datasets

The datasets can be downloaded below. Each link is password protected, and each organisation will receive a unique username and password to access the data. To get these access codes, please email the organisers with the signed organisation agreement form attached.

Format of the NTCIR-13 LifeLog datasets

The root of the ZIP files contains an .xml file, which is a simple aggregation of all users' data. It is structured as follows:

The root node of the data is the USERS tag. Each user (u1 or u2) has a USER tag that carries the user ID as an attribute, for example [user id="u1"]. The USER element contains all the data of that user.

Following that, there is a DAYS tag, which contains the lifelogging information of that user organised per day. Each day is enclosed in a DAY tag that holds the data (a DATA tag), the relative path to the directory containing the images captured on that particular day (the IMAGES-DIRECTORY tag), and then the minutes of that day under a root tag called MINUTES.
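As a rough illustration of the nesting described above, the following Python sketch walks a tiny hand-written sample. The lowercase tag names, the hyphenated images-directory tag and all sample values are assumptions inferred from this description, not copied from the released file, so they should be checked against the actual XML before being relied on.

```python
import xml.etree.ElementTree as ET

# Hand-written sample that mimics the described nesting.
# Tag spellings and values are assumptions, not taken from the dataset.
SAMPLE = """
<users>
  <user id="u1">
    <days>
      <day>
        <images-directory>u1/day1</images-directory>
        <minutes>
          <minute id="0"/>
          <minute id="1"/>
        </minutes>
      </day>
    </days>
  </user>
</users>
"""


def summarise(xml_text):
    """Return one (user id, images directory, minute count) tuple per day."""
    root = ET.fromstring(xml_text)
    rows = []
    for user in root.findall("user"):
        for day in user.findall("days/day"):
            directory = day.findtext("images-directory", default="")
            minutes = day.findall("minutes/minute")
            rows.append((user.get("id"), directory, len(minutes)))
    return rows


print(summarise(SAMPLE))
```

For the full file, ET.parse (or ET.iterparse for lower memory use) could replace ET.fromstring; the traversal itself would stay the same.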

At the start of each day there is a set of daily metadata for that user. This data takes three forms: BIOMETRICS, ACTIVITIES and PERSONAL LOGS. The biometrics contain WEIGHT, FAT MASS, HEART RATE, SYSTOLIC blood pressure and DIASTOLIC blood pressure, which are readings taken after waking up each day. The activities contain summary activities: STEPS taken that day, DISTANCE walked in metres that day and ELEVATION climbed in metres that day. The personal logs contain HEALTH LOGS, including the TIME of reading, GLU (glucose levels in the blood), BP (blood pressure), HR (heart rate), MOOD (manually logged every morning) and sometimes a COMMENT, as well as DRINK LOGS and FOOD LOGS, which were manually logged throughout the day.

Following that, the day's data is organised into minutes. The MINUTES element contains exactly 1440 child elements (called MINUTE). Each child has an ID (for example [minute id="0"], [minute id="1"], [minute id="2"], etc.) and represents one minute of the day, ordered from 0 = 00:00 (12:00 AM) to 1439 = 23:59.
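Since the minute ID is simply an offset from midnight, it can be turned into a clock time with integer division. The helper below is a minimal sketch; the function name is illustrative and not part of the dataset or any organiser tooling.

```python
def minute_to_clock(minute_id):
    """Convert a minute ID (0..1439) to an HH:MM string on a 24-hour clock."""
    if not 0 <= minute_id <= 1439:
        raise ValueError("minute id must be between 0 and 1439")
    hours, minutes = divmod(minute_id, 60)
    return f"{hours:02d}:{minutes:02d}"


print(minute_to_clock(0))     # 00:00
print(minute_to_clock(1439))  # 23:59
```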

Each minute contains: 0 or 1 location information (LOCATION tag), 0 or 1 activity information (ACTIVITY tag), biometrics, 0 or more captured images (an IMAGES tag with IMAGE child elements, each of which has a relative path to the image and a unique image ID), and 0 or 1 MUSIC tag giving details of the music listened to at that point in time.

The location information is captured by the Moves app and refers either to semantic locations (Home, Work, DCU Computing building, GYM, name of a store, etc.) or to landmark locations registered by Moves. This tag can contain information in several languages. For locations that are not (HOME) or (WORK), the GPS locations are provided.
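Because most parts of a minute are optional, code that reads a minute has to tolerate missing tags. The sketch below shows one way to do this in Python. The sample element is invented for illustration: the child names name, image-id and image-path, the lowercase tag spellings and all values are assumptions, and biometrics are left out because their layout is not described here.

```python
import xml.etree.ElementTree as ET

# Hypothetical minute element; child names and values are assumptions.
SAMPLE_MINUTE = """
<minute id="480">
  <location><name>Home</name></location>
  <activity>walking</activity>
  <images>
    <image><image-id>u1_001</image-id><image-path>u1/day1/001.jpg</image-path></image>
    <image><image-id>u1_002</image-id><image-path>u1/day1/002.jpg</image-path></image>
  </images>
</minute>
"""


def read_minute(xml_text):
    """Collect the optional parts of one minute into a plain dictionary."""
    minute = ET.fromstring(xml_text)
    images = [
        (img.findtext("image-id"), img.findtext("image-path"))
        for img in minute.findall("images/image")
    ]
    return {
        "id": int(minute.get("id")),
        "location": minute.findtext("location/name"),  # None when absent
        "activity": minute.findtext("activity"),       # None when absent
        "has_music": minute.find("music") is not None,
        "images": images,                              # empty list when absent
    }


info = read_minute(SAMPLE_MINUTE)
print(info["location"], info["activity"], len(info["images"]))
```

Returning None or an empty list for absent parts keeps the "0 or 1" and "0 or more" cases explicit for whatever indexing step comes next.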

An example of the XML file for one minute is provided, along with examples of the daily metadata for u1 (user 1).