Head/Tail Breaks: A New Classification Scheme for Data with a Heavy-Tailed Distribution

Type: Article

Publication Date: 2012-07-30

Citations: 341

DOI: https://doi.org/10.1080/00330124.2012.700499

Abstract

This article introduces a new classification scheme—head/tail breaks—to find groupings or hierarchy for data with a heavy-tailed distribution. The heavy-tailed distributions are heavily right skewed, with a minority of large values in the head and a majority of small values in the tail, commonly characterized by a power law, a lognormal, or an exponential function. For example, a country's population is often distributed in such a heavy-tailed manner, with a minority of people (e.g., 20 percent) in the countryside and the vast majority (e.g., 80 percent) in urban areas. This new classification scheme partitions all of the data values around the mean into two parts and continues the process iteratively for the values (above the mean) in the head until the head part values are no longer heavy-tailed distributed. Thus, the number of classes and the class intervals are both naturally determined. I therefore claim that the new classification scheme is more natural than the natural breaks in finding the groupings or hierarchy for data with a heavy-tailed distribution. I demonstrate the advantages of the head/tail breaks method over Jenks's natural breaks in capturing the underlying hierarchy of the data.
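The iterative procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the article's own code. The function name `head_tail_breaks` and the `head_limit` parameter are hypothetical. The abstract does not say how to decide that the head is "no longer heavy-tailed", so the sketch assumes a simple stand-in rule: stop once the head holds more than 40 percent of the values it was split from.

```python
from statistics import mean


def head_tail_breaks(values, head_limit=0.4):
    """Return class breaks found by repeatedly splitting around the mean.

    head_limit is an assumed stopping threshold (not taken from the
    abstract): recursion into the head continues only while the head
    is a minority of the values it was split from.
    """
    breaks = []
    current = list(values)
    while len(current) > 1:
        m = mean(current)
        # The head is everything strictly above the mean.
        head = [v for v in current if v > m]
        if not head:
            # All remaining values are equal, so no further split exists.
            break
        breaks.append(m)
        if len(head) / len(current) > head_limit:
            # Assumed rule: the head is no longer a small minority, so stop.
            break
        current = head
    return breaks


# Seven small values and a few large ones: two breaks, three classes.
print(head_tail_breaks([1, 1, 1, 1, 1, 1, 1, 2, 3, 10]))  # [2.2, 6.5]
```

Each returned break is the mean of the values still under consideration, so both the number of classes (one more than the number of breaks) and the class intervals come from the data rather than being fixed in advance.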

Locations

  • arXiv (Cornell University)
  • DataCite API
  • The Professional Geographer

Similar Works

  • Basic steps of analysis for heavy-tailed distributions: visualizing, fitting, and comparing (2014). Jeff Alstott, Ed Bullmore, Dietmar Plenz
  • Heavy‐tailed densities (2012). Javier Rojo
  • Mastering the body and tail shape of a distribution (2019). Matthias Wagener, Mohammad Arashi
  • Introduction (2022)
  • Advanced in Heavy-tailed Distribution Tail Index Estimation (2012). Xing Hong-wei
  • Heavy‐Tailed Distributions (2014). Maurice C. Bryson
  • New Technique to Estimate the Asymmetric Trimming Mean (2010). Ahmad M. H. Al-Khazaleh, Ahmad Mahir Razali
  • Heavy-Tailed Distributions: Data, Diagnostics, and New Developments (2011). Roger Cooke, Daan Nieboer
  • How Much Data Do You Need? An Operational, Pre-Asymptotic Metric for Fat-tailedness (2018). Nassim Nicholas Taleb
  • Heavy or semi-heavy tail, that is the question (2020). Jamil Ownuk, Hossein Baghishani, Ahmad Nezakati
  • A generalized boxplot for skewed and heavy-tailed distributions (2014). Christopher Bruffaerts, Vincenzo Verardi, Catherine Vermandele
  • Measuring heavy-tailedness of distributions (2017). Pavlina Jordanova, Monika P. Petkova
  • Modeling light-tailed and right-skewed data with a new asymmetric distribution (2016). Meitner Cadena
  • Statistical Consequences of Fat Tails: Real World Preasymptotics, Epistemology, and Applications (2020). Nassim Nicholas Taleb
  • Robust outlier labeling rules for light-tailed and heavy-tailed Data (2019). Kelly Cristina Ramos da Silva
  • Fat Tails Quantified and Resolved: A New Distribution to Reveal and Characterize the Risk and Opportunity Inherent in Leptokurtic Data (2011). Lawrence R. Thorne
  • The Trick of the Tail: Segmenting Heavy-Tailed Distributions (2024). Jonathan Dunne, Sonya Leech, Markus U. Müller, Irene Manotas, Mary Swift
  • Skewed Pivot-Blend Modeling with Applications to Semicontinuous Outcomes (2024). Yiyuan She, Xiaoqiang Wu, Lizhu Tao, Debajyoti Sinha
  • Sv-plots for identifying characteristics of the distribution and testing hypotheses (2020). Uditha Amarananda Wijesuriya