Data – key terms

This resource provides explanations of some of the key terms associated with data (raraunga) and data use.

  • Algorithm (hātepe)

  • Artificial intelligence (whakataruna hinengaro)

  • Big data

  • Data mining (taurite raraunga)

  • Data scraping

  • Data sovereignty

  • Indigenous data sovereignty

  • Large language model

  • Machine learning

  • Māori data

  • Māori data sovereignty

  • Privacy (tūmataitinga)

  • Qualitative data (raraunga kounga)

  • Quantitative data (raraunga tatau)

  • Storage (rokiroki)

  • Time series data (raraunga houanga)

Algorithm (hātepe)

An algorithm is a list of steps that help solve a problem or complete a task. Search engines use algorithms to provide search results. Search engines and social media platforms also use AI-enhanced algorithms to analyse our habits and interests, and these algorithms can determine what we see and don’t see. Learn more about algorithms in the Connected article Amazing algorithms and in Online algorithms, biases and incorrect information.
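To make the idea of "a list of steps" concrete, here is a minimal Python sketch of one very simple algorithm: finding the largest number in a list. The function name find_largest and the sample numbers are made up for illustration and do not come from any particular search engine or platform.

```python
def find_largest(numbers):
    """Return the largest value in a non-empty list by following fixed steps."""
    # Step 1: start by assuming the first number is the largest.
    largest = numbers[0]
    # Step 2: look at each remaining number in turn.
    for n in numbers[1:]:
        # Step 3: if it is bigger than the current largest, remember it instead.
        if n > largest:
            largest = n
    # Step 4: once every number has been checked, report the answer.
    return largest


print(find_largest([3, 17, 8, 42, 5]))  # prints 42
```

The same steps give the same answer every time for the same input, which is what makes this an algorithm rather than a guess.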

ChatGPT on ChatGPT

An example of text generated by the AI large language model ChatGPT.

Generated by ChatGPT.

Rights: The University of Waikato Te Whare Wānanga o Waikato

Artificial intelligence (whakataruna hinengaro)

Artificial intelligence (AI) is technology that enables computer systems to simulate human behaviour such as learning and comprehension, problem solving, decision making and creativity. AI is utilised in robotics to perform complex physical tasks without having to fully define each separate action.

The article Artificial intelligence provides a concise introduction to different types of AI. Bots vs Beings – the impacts of AI on life and work has information and panel discussion videos.

Big data

Big data refers to datasets that are very large and/or very complex and are difficult or very resource intensive to process using traditional data-processing tools. The data may combine multiple sources, and it takes skill to extract meaning from it. Big data is often defined by the seven Vs:

  • Volume: How much data – enormous amounts of data are created every second of every day.

  • Velocity: How quickly data arrives and is processed into an accessible form – dealing with the volume of data as it comes in.

  • Variety: Types and organisation of the data – data can be structured (datasets) and unstructured (social media, photos, etc.).

  • Veracity: The quality of the data – ensuring it is trustworthy.

  • Variability: Whether the same data is consistent over time – for example, a value every set period or a value with the same units each time. This involves understanding the context and meaning of the data.

  • Visualisations: Making data understandable to experts and non-experts by using charts, graphs, etc.

  • Value: Whether the results are useful – using big data for problem solving and/or decision making.

The articles Data about data and Data and how we use it offer additional information.

Data mining (taurite raraunga)

Data mining uses data analysis software to discover anomalies, patterns and correlations within large volumes of data. It is used, for example, in healthcare to help diagnose diseases and in finance to detect fraud. The articles Data about data and Data and how we use it offer additional information.
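As a rough sketch of what "discovering anomalies" can look like, the Python example below flags values that sit unusually far from the average. The payment amounts are invented for illustration, and the cut-off of two standard deviations is an assumption. Real data mining tools use much larger datasets and more sophisticated methods.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the values that lie more than `threshold` standard deviations from the mean."""
    average = mean(values)
    spread = pstdev(values)
    if spread == 0:
        # Every value is identical, so nothing stands out.
        return []
    return [v for v in values if abs(v - average) / spread > threshold]


# Invented payment amounts: one is far larger than the rest.
amounts = [12.5, 9.99, 14.2, 11.0, 950.0, 13.75, 10.4, 12.1]
print(find_anomalies(amounts))  # prints [950.0]
```

A bank might treat a flagged value like this as a prompt for a person to take a closer look, rather than as proof of fraud.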

Online recipe

Online chefs use extraneous text to hide their work from data scrapers.

Background photo by fabiobalbi, 123RF Ltd. 

Rights: The University of Waikato Te Whare Wānanga o Waikato

Data scraping

Data scraping is the practice of programming computers to pull down large amounts of data from another program’s end-user output without proper integration. A scraper program accesses a website, navigates through its structure and extracts displayed data. Scraped data is often used to train machine learning models, for market research and/or for analysis. The articles Data about data and Data and how we use it offer additional information.
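The short Python sketch below shows the basic idea of a scraper navigating a page's structure and extracting displayed data. To keep it self-contained, it reads a made-up snippet of HTML stored in the program rather than a real website. The class name ListItemScraper and the recipe content are assumptions for illustration only.

```python
from html.parser import HTMLParser

# A made-up snippet standing in for the HTML a recipe website might display.
PAGE = """
<html><body>
  <h1>Tomato pasta</h1>
  <p>A long story about my holiday...</p>
  <ul>
    <li>tomatoes</li>
    <li>garlic</li>
    <li>olive oil</li>
  </ul>
</body></html>
"""


class ListItemScraper(HTMLParser):
    """Collect the text found inside <li> (list item) tags."""

    def __init__(self):
        super().__init__()
        self.in_item = False  # True while the parser is inside an <li> tag
        self.items = []       # the text extracted so far

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.in_item = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_item = False

    def handle_data(self, data):
        # Keep only non-empty text that appears inside a list item.
        if self.in_item and data.strip():
            self.items.append(data.strip())


scraper = ListItemScraper()
scraper.feed(PAGE)
print(scraper.items)  # prints ['tomatoes', 'garlic', 'olive oil']
```

The scraper skips the heading and the story paragraph and keeps only the list items, which is why extra text on a page does not necessarily stop a scraper that targets a specific part of the structure.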

Data sovereignty

Data sovereignty is about ensuring that we have authority and control over our own data, encompassing aspects such as collection, ownership, storage and usage. It is important for safeguarding privacy, ensuring self-determination and preserving cultural, economic and social interests tied to the data. Data sovereignty allows individuals or communities to determine how their data is managed and shared in accordance with their values and priorities. Find out more in the article ChatGPT and Māori data sovereignty.

Indigenous data sovereignty

Indigenous data sovereignty recognises that data is more than information – it encompasses wellbeing, self-determination and cultural preservation. It recognises the right of indigenous people to govern the collection, ownership and application of their own data as society becomes more digitised – this includes data relating to health, education and other social realms. See the CARE Principles for Indigenous Data Governance. These were first drafted in 2018 by a panel that included experts from Aotearoa, Australia, Africa and the Americas.

Large language model

A large language model (LLM) is a type of text-based artificial intelligence that uses deep learning and enormous datasets to summarise, predict and generate written content. LLMs are trained with self-supervised learning. Most LLMs are becoming more accurate with time but they do have limitations, including hallucinations (making up information), bias and information that might not be current. The article ChatGPT – generating text and ethical concerns goes into more depth about some of the ethical questions raised by LLMs such as ChatGPT and others.

Machine learning

Machine learning describes any process where a computer figures out how to perform a task by itself. Simply explained, a machine (computer) has access to information and instructions and uses algorithms to find patterns leading to success or failure of the task. Iteration – using small changes in the algorithm – leads to improved decision trees and processes. The article Artificial intelligence has additional information about this topic.
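The toy Python sketch below illustrates the idea of iteration through small changes. The program is not told the rule linking the inputs to the outputs. Instead, it repeatedly nudges a single number up or down and keeps whichever version makes fewer mistakes. The data, the function names and the step size are all invented for illustration, and real machine learning systems adjust far more than one number.

```python
def error(w, xs, ys):
    """Total squared difference between the guesses w * x and the true answers y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys))


def learn_weight(xs, ys, step=0.1, max_rounds=100):
    """Find a multiplier w so that w * x is close to y, using small trial changes."""
    w = 0.0  # initial guess
    for _ in range(max_rounds):
        # Try a slightly smaller value, the current value and a slightly larger one.
        best = min((w - step, w, w + step), key=lambda c: error(c, xs, ys))
        if best == w:
            # Neither small change helps any more, so stop.
            break
        w = best
    return w


# Invented examples where each output happens to be double the input.
inputs = [1, 2, 3, 4]
outputs = [2, 4, 6, 8]
learned = learn_weight(inputs, outputs)
print(round(learned, 2))  # prints 2.0
```

The program ends up close to the "double it" rule purely by measuring success and failure on the examples it was given.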

Māori data

Professor Tahu Kukutai defines Māori data as “information or knowledge in a digital or digitisable form that is about or from Māori peoples and our environments, regardless of who controls it”. This includes data from Māori, data about Māori and data about Māori resources.

Māori data sovereignty

Graphic illustration completed by Fuselight Creative during a presentation by Dr Te Taka Keegan titled Issues with Māori Sovereignty over Māori Language Data. 

The presentation was given at the HELISET TŦE SḰÁL ‘Let the Languages Live’ – 2019 International Conference on Indigenous Languages hosted by the First Peoples’ Cultural Council and First Peoples’ Cultural Foundation in Victoria B.C., Canada.

Original illustration © Fuselight Creative.

Rights: The University of Waikato Te Whare Wānanga o Waikato

Māori data sovereignty

Māori data sovereignty recognises that Māori have the right to govern the collection, ownership and application of Māori data. The article ChatGPT and Māori data sovereignty explores some of the cautions and promises that Dr Te Taka Keegan sees in the future of LLMs.

Privacy (tūmataitinga)

Digital privacy is the ability to control and protect one’s own personal information while using the internet. This includes what data is collected, how it is stored and how it is used. The article Data about data offers information about owning and protecting your data.

Qualitative data (raraunga kounga)

Qualitative data is descriptive rather than numerical. Qualitative research methods often use interviews, observations and case studies to gain information.

Quantitative data (raraunga tatau)

Quantitative data is measurable via numbers and statistics. Quantitative research methods often use experiments and statistical analysis to gain information.

Storage (rokiroki)

Digital data storage can be local – storing information on a personal device or hard drive. Another option is via cloud storage – using remote servers that can be accessed from various devices. With both options, users should consider data security and reliability of access.

Soil moisture deficit map of New Zealand 2016.

Soil moisture deficit map

A soil moisture deficit map indicates whether soils have a water surplus, are at field capacity or have a water deficit.

Rights: NIWA. This work is licensed under a Creative Commons Attribution 4.0 International Licence.

Time series data (raraunga houanga)

Time series data is data that is recorded at consistent intervals of time. The data reflects the changing state of a phenomenon over time. This soil moisture deficit map shows soil moisture levels over 30-year and 1-year periods.
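As a small illustration, the Python sketch below stores a time series as pairs of dates and readings, checks that the readings are evenly spaced and then works out how much the value changed between one reading and the next. The weekly readings are invented for this example and are not taken from the NIWA map.

```python
from datetime import date, timedelta

# Invented weekly readings, for illustration only.
start = date(2016, 1, 1)
values = [12.0, 15.5, 19.0, 18.0, 22.5]
series = [(start + timedelta(days=7 * i), v) for i, v in enumerate(values)]

# A time series should have consistent intervals: here every gap is 7 days.
gaps = {(later[0] - earlier[0]).days for earlier, later in zip(series, series[1:])}
is_regular = len(gaps) == 1

# The change from each reading to the next shows how the phenomenon shifts over time.
changes = [round(later[1] - earlier[1], 1) for earlier, later in zip(series, series[1:])]

print(is_regular)  # prints True
print(changes)     # prints [3.5, 3.5, -1.0, 4.5]
```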

Related content

Data is a complex concept. Learn more about internet data in the articles Data about data and Data and how we use it.

Many online citizen science projects use citizen scientists’ data to help machine learning algorithms – see some examples here.

Useful links

See the Technology section on the Tāhūrangi New Zealand Curriculum website.

More reo Māori terms related to the digital world:

  • Dictionary of Māori computer and social media terms and Compendium of Māori data sovereignty both from Dr Karaitiana Taiuru’s website. There are many other resources here too.

  • Kia kaha te reo hangarau! Technology words – give it a go! from the Ministry of Education | Te Tāhuhu o Te Mātauranga.

  • Te reo Māori digital technology terms from IT Professionals New Zealand.

Osmos has a comprehensive list of data terms.

Netsafe helps New Zealanders take advantage of the digital opportunities available while managing online challenges.

Explore the resources in the Artificial intelligence archived section on the Prime Minister’s Chief Science Advisor website.

The Royal Society Te Apārangi Mana raraunga | Data sovereignty 2023 report outlines what data sovereignty is and why it matters in Aotearoa New Zealand.

Te Mana Raraunga | Māori Data Sovereignty Network has useful information on data sovereignty and other related aspects, including its charter document.

Published: 21 August 2025

The Science Learning Hub Pokapū Akoranga Pūtaiao is funded through the Ministry of Business, Innovation and Employment's Science in Society Initiative.

Science Learning Hub Pokapū Akoranga Pūtaiao © 2007-2025 The University of Waikato Te Whare Wānanga o Waikato