Athletes Visualization

TIP

Source · This is a genji example to create interactive data reports with plots and inputs.

Here is an athlete dataset for the Asian Games 2023. As we can see, the data is tabular, and each row corresponds to one athlete.

js
Inputs.table(data);

For each athlete (or data point), we have access to birthday, gender (0 for male and 1 for female), age, height, sport, noc (country), gold, silver and bronze.

Let's make some plots with this dataset using Observable Plot 🚀

General

First, we create a one-dimensional bubble chart in which the radius encodes the count of athletes, binned by the selected numerical dimension. We can see that the majority of athletes were born around 2001, making them roughly 22 years old, and that the median height is about 170 cm.

js
bubbleX = Inputs.radio(["birthday", "age", "height"], {
  label: "Bin dimension",
  value: "birthday",
});
js
Plot.plot({
  width,
  r: { range: [0, 14] },
  marks: [
    Plot.dot(
      data,
      Plot.binX(
        { r: "count" },
        {
          x: bubbleX,
          tip: true,
        }
      )
    ),
  ],
});
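
To make the binning step concrete, here is a minimal sketch in plain JavaScript of what "count per bin" means. The `binCounts` helper, the fixed bin width and the sample heights are all assumptions made for illustration; Plot.binX picks its own thresholds, so the real bin edges will likely differ.

```js
// Hypothetical helper: count values per fixed-width bin.
// This is NOT how Plot chooses thresholds; it only illustrates the idea.
function binCounts(values, binWidth) {
  const counts = new Map();
  for (const v of values) {
    if (v == null || Number.isNaN(v)) continue; // skip missing values
    const x0 = Math.floor(v / binWidth) * binWidth; // lower edge of the bin
    counts.set(x0, (counts.get(x0) ?? 0) + 1);
  }
  // Return the bins sorted by their lower edge.
  return [...counts]
    .sort((a, b) => a[0] - b[0])
    .map(([x0, count]) => ({ x0, x1: x0 + binWidth, count }));
}

// Made-up heights in cm, not taken from the dataset.
const heights = [158, 162, 167, 170, 171, 174, 183];
const bins = binCounts(heights, 5);
console.log(bins);
```

In the chart, each bin becomes one dot whose radius grows with `count`.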

Sports

Then we compare the distribution of the selected dimension for athletes within each sport, using a fill encoding that represents the proportion of athletes. Sports are sorted by the median value of the chosen dimension: artistic gymnasts tend to be the shortest and volleyball players the tallest; rhythmic gymnasts tend to be the youngest and bridge players the oldest.

js
heatmapX = Inputs.radio(["height", "age"], {
  label: "Bin dimension",
  value: "height",
});
js
Plot.plot({
  marginLeft: 130,
  marginTop: 10,
  width,
  x: { grid: true },
  fy: {
    domain: d3.groupSort(
      data.filter((d) => d[heatmapX]),
      (g) => d3.median(g, (d) => d[heatmapX]),
      (d) => d.sport
    ),
    label: null,
  },
  color: { scheme: "YlGnBu", legend: true },
  marks: [
    Plot.rect(
      data,
      Plot.binX(
        { fill: "proportion-facet" },
        { x: heatmapX, fy: "sport", inset: 0.5 }
      )
    ),
  ],
});
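
The ordering of the facets above relies on sorting sports by a per-group median. The sketch below reproduces that idea without d3, assuming a hypothetical `sortKeysByMedian` helper and made-up rows with placeholder sport names; it is meant to show the logic, not to replace `d3.groupSort`.

```js
// Median of a list of numbers (undefined for an empty list).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  if (sorted.length === 0) return undefined;
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical helper: group rows by `key`, then order the keys by the
// median of `field`, ascending. Rows with a missing `field` are skipped,
// mirroring the filter used in the chart above.
function sortKeysByMedian(rows, key, field) {
  const groups = new Map();
  for (const row of rows) {
    if (!row[field]) continue;
    if (!groups.has(row[key])) groups.set(row[key], []);
    groups.get(row[key]).push(row[field]);
  }
  return [...groups]
    .map(([name, values]) => [name, median(values)])
    .sort((a, b) => a[1] - b[1])
    .map(([name]) => name);
}

// Made-up rows with placeholder sports, not taken from the dataset.
const heightRows = [
  { sport: "Sport A", height: 190 },
  { sport: "Sport A", height: 196 },
  { sport: "Sport B", height: 152 },
  { sport: "Sport B", height: 158 },
  { sport: "Sport C", height: 170 },
  { sport: "Sport C", height: null },
];
const sportOrder = sortKeysByMedian(heightRows, "sport", "height");
console.log(sportOrder);
```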

Next, we use directional arrows to show the difference between the counts of male and female athletes in each sport. The color of the arrow indicates which sex is more prevalent, while its length is proportional to the difference. We observe that most sports are male-dominated; when sports are ranked by player count, football has the most players and golf the fewest.

js
linkOrder = Inputs.radio(["null", "ascending y", "descending y"], {
  label: "Order by",
  value: "null",
});
js
Plot.plot({
  marginBottom: 125,
  x: {
    label: null,
    tickRotate: 90,
    domain: d3.groupSort(
      data,
      (g) =>
        linkOrder === "null"
          ? true
          : linkOrder === "ascending y"
          ? g.length
          : -g.length,
      (d) => d.sport
    ),
  },
  y: { grid: true, label: "Frequency" },
  color: {
    type: "categorical",
    domain: [-1, 1],
    unknown: "#aaa",
    transform: Math.sign,
    legend: true,
    tickFormat: (d) => (d === -1 ? "Female Prevalent" : "Male Prevalent"),
  },
  marks: [
    Plot.ruleY([0]),
    Plot.link(
      data,
      Plot.groupX(
        {
          y1: (D) => d3.sum(D, (d) => d === "Female"),
          y2: (D) => d3.sum(D, (d) => d === "Male"),
          stroke: (D) =>
            d3.sum(D, (d) => d === "Male") - d3.sum(D, (d) => d === "Female"),
        },
        {
          x: "sport",
          y1: sex,
          y2: sex,
          markerStart: "dot",
          markerEnd: "arrow",
          stroke: sex,
          strokeWidth: 2,
        }
      )
    ),
  ],
});
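
The reducers in the link mark boil down to counting female and male athletes per sport and taking the sign of the difference. Here is a rough plain-JavaScript equivalent, assuming a hypothetical `sexGap` helper and made-up rows with placeholder sport names.

```js
// Same decoding as the `sex` helper defined in the Reference section.
const decodeSex = (d) => (d.gender === 0 ? "Male" : "Female");

// Hypothetical helper: per-sport female/male counts plus the sign of the gap.
// sign is 1 when males prevail, -1 when females prevail, 0 on a tie.
function sexGap(rows) {
  const bySport = new Map();
  for (const row of rows) {
    const entry = bySport.get(row.sport) ?? { sport: row.sport, female: 0, male: 0 };
    if (decodeSex(row) === "Male") entry.male += 1;
    else entry.female += 1;
    bySport.set(row.sport, entry);
  }
  return [...bySport.values()].map((e) => ({
    ...e,
    sign: Math.sign(e.male - e.female),
  }));
}

// Made-up rows, not taken from the dataset.
const genderRows = [
  { sport: "Sport A", gender: 0 },
  { sport: "Sport A", gender: 0 },
  { sport: "Sport A", gender: 1 },
  { sport: "Sport B", gender: 1 },
  { sport: "Sport C", gender: 0 },
  { sport: "Sport C", gender: 1 },
];
const gaps = sexGap(genderRows);
console.log(gaps);
```

In the chart, each arrow runs from the female count (`y1`) to the male count (`y2`). A tie gives a sign of 0, which lies outside the color domain `[-1, 1]` and so should fall back to the `unknown` color.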

Countries

Now we draw a bar chart showing the distribution of athletes by country. Countries are sorted by player count: Thailand (THA) has the largest group of players and Brunei Darussalam (BRU) the smallest. If we do not stack the bars, we can see from the fill of the left ranges that male players outnumber female players in most countries.

js
stack = Inputs.toggle({ label: "Stack", value: true });
js
Plot.plot({
  marginLeft: 50,
  marginTop: 0,
  width,
  color: { legend: true },
  y: {
    label: "Country",
    domain: d3.groupSort(
      data,
      (d) => -d.length,
      (d) => d.noc
    ),
  },
  x: {
    grid: true,
    label: "Count →",
  },
  marks: [
    Plot.barX(
      data,
      Plot.groupY(stack ? { x: "count" } : { x1: "count" }, {
        y: "noc",
        fill: sex,
        tip: true,
        mixBlendMode: stack ? null : "multiply",
      })
    ),
    Plot.ruleX([0]),
  ],
});
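
As a small illustration of the sort used for the y domain, the sketch below counts athletes per country, split by gender, and ranks countries by total in descending order. The `countByCountry` helper and the placeholder country codes are assumptions made for this example only.

```js
// Hypothetical helper: per-country counts, largest group first.
function countByCountry(rows) {
  const counts = new Map();
  for (const { noc, gender } of rows) {
    const c = counts.get(noc) ?? { noc, male: 0, female: 0, total: 0 };
    if (gender === 0) c.male += 1; // 0 encodes male in this dataset
    else c.female += 1;
    c.total += 1;
    counts.set(noc, c);
  }
  // Descending by total, like sorting groups by negative length.
  return [...counts.values()].sort((a, b) => b.total - a.total);
}

// Made-up rows with placeholder codes, not taken from the dataset.
const countryRows = [
  { noc: "AAA", gender: 0 },
  { noc: "BBB", gender: 0 },
  { noc: "BBB", gender: 1 },
  { noc: "BBB", gender: 0 },
  { noc: "CCC", gender: 1 },
  { noc: "CCC", gender: 0 },
];
const ranked = countByCountry(countryRows);
console.log(ranked.map((c) => `${c.noc}: ${c.total}`));
```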

To learn something about winners, we draw the chart below. Each rect represents a country: x encodes the country's number of participants, while y encodes the proportion of those participants who won a medal; hence the area represents the number of medal-winning participants. Rects are stacked along x in order of descending y. We can observe that fewer than 50% of participants won a medal and that China (CHN) has the highest winning rate, about 76%.

js
Plot.plot({
  x: { label: "Participants →" },
  y: {
    nice: true,
    percent: true,
    label: "↑ Winning rate (%)",
  },
  width,
  marks: [
    Plot.rectY(
      d3
        .groups(data, (d) => d.noc)
        .map(([key, group]) => ({
          noc: key,
          total: group.length,
          ratio:
            d3.sum(group, (d) => (d.gold || d.silver || d.bronze ? 1 : 0)) /
            group.length,
        })),
      Plot.stackX({
        order: "ratio",
        x: "total",
        y2: "ratio",
        reverse: true,
        insetLeft: 0.2,
        insetRight: 0.2,
        tip: true,
        channels: {
          country: "noc",
        },
      })
    ),
  ],
});
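
To see why the area of each rect equals the number of medal winners, here is a minimal sketch of the same computation in plain JavaScript. The `winningRects` helper, the explicit `x1`/`x2` offsets and the placeholder country codes are assumptions for illustration; in the chart, the offsets come from Plot.stackX.

```js
// Hypothetical helper: one rect per country, stacked along x by descending ratio.
function winningRects(rows) {
  const groups = new Map();
  for (const row of rows) {
    const g = groups.get(row.noc) ?? { noc: row.noc, total: 0, winners: 0 };
    g.total += 1;
    if (row.gold || row.silver || row.bronze) g.winners += 1; // any medal counts
    groups.set(row.noc, g);
  }
  let offset = 0; // running x position of the stack
  return [...groups.values()]
    .map((g) => ({ ...g, ratio: g.winners / g.total }))
    .sort((a, b) => b.ratio - a.ratio)
    .map((g) => {
      const rect = { ...g, x1: offset, x2: offset + g.total };
      offset += g.total;
      return rect;
    });
}

// Made-up rows with placeholder codes, not taken from the dataset.
const athlete = (noc, gold, silver, bronze) => ({ noc, gold, silver, bronze });
const medalRows = [
  athlete("CCC", 0, 0, 1), athlete("CCC", 0, 0, 0),
  athlete("CCC", 0, 0, 0), athlete("CCC", 0, 0, 0),
  athlete("AAA", 1, 0, 0), athlete("AAA", 0, 2, 0),
  athlete("AAA", 0, 0, 1), athlete("AAA", 0, 0, 0),
  athlete("BBB", 0, 1, 0), athlete("BBB", 0, 0, 0),
];
const rects = winningRects(medalRows);
for (const r of rects) {
  // width * height = total * (winners / total) = winners
  console.log(r.noc, r.x1, r.x2, r.ratio, (r.x2 - r.x1) * r.ratio);
}
```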

Medals

Finally, we draw a beeswarm plot to look at the players of each country. Each circle represents an athlete: x encodes birthday, fill encodes award status, and r encodes the number of medals. We can see that most of the players from China won gold medals, especially the female players, who seem to be "more golden" than the male players. Feel free to explore your own country with the search and select inputs!

js
countries = Inputs.search(Array.from(new Set(data.map((d) => d.noc))), {
  label: "Search",
});
js
country = Inputs.select(countries, { label: "Country", value: "CHN" });
js
maxR = Inputs.range([5, 10], { label: "Radius", step: 0.1, value: 6.5 });
js
dodgeP = Inputs.range([-1, 5], { label: "Padding", step: 0.1, value: 1 });
js
Plot.plot({
  height: 720,
  marginLeft: 50,
  width,
  fy: { padding: 0 },
  x: { grid: true, nice: true },
  color: {
    type: "categorical",
    legend: true,
    tickFormat: (d) => ["Gold", "Silver", "Bronze", "None"][d],
    domain: [0, 1, 2, 3],
    range: ["#F6BD16", "#5D7092", "#CE8032", "#aaa"],
  },
  r: { range: [2, maxR] },
  marks: [
    Plot.frame({ fy: "Female", stroke: "currentColor", anchor: "bottom" }),
    Plot.dot(
      data.filter((d) => d.noc === country).filter((d) => d.gender === 1),
      Plot.dodgeY({
        x: "birthday",
        fy: sex,
        fill: (d) => (d.gold ? 0 : d.silver ? 1 : d.bronze ? 2 : 3),
        sort: { channel: "fill" },
        r: (d) => d.gold + d.silver + d.bronze,
        tip: true,
        padding: dodgeP,
      })
    ),
    Plot.dot(
      data.filter((d) => d.noc === country).filter((d) => d.gender === 0),
      Plot.dodgeY({
        x: "birthday",
        fy: sex,
        fill: (d) => (d.gold ? 0 : d.silver ? 1 : d.bronze ? 2 : 3),
        sort: { channel: "fill" },
        r: (d) => d.gold + d.silver + d.bronze,
        tip: true,
        padding: dodgeP,
        anchor: "top",
      })
    ),
  ],
});
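
The fill and r channels above are derived from the three medal columns. The sketch below isolates that mapping so it can be checked on a few rows; the helper names `medalIndex` and `medalCount` and the sample athletes are made up for illustration.

```js
// Labels in the same order as the color domain [0, 1, 2, 3] used above.
const MEDAL_LABELS = ["Gold", "Silver", "Bronze", "None"];

// Best medal won: 0 = gold, 1 = silver, 2 = bronze, 3 = none.
const medalIndex = (d) => (d.gold ? 0 : d.silver ? 1 : d.bronze ? 2 : 3);

// Total number of medals, which drives the dot radius.
const medalCount = (d) => d.gold + d.silver + d.bronze;

// Made-up athletes, not taken from the dataset.
const sampleAthletes = [
  { gold: 2, silver: 1, bronze: 0 },
  { gold: 0, silver: 0, bronze: 1 },
  { gold: 0, silver: 0, bronze: 0 },
];
const encoded = sampleAthletes.map((d) => ({
  fill: MEDAL_LABELS[medalIndex(d)],
  medals: medalCount(d),
}));
console.log(encoded);
```

Note that the fill reflects only the best medal, so an athlete with one gold and several bronzes is still colored gold, while the radius reflects the total.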

Reference

js
sex = (d) => (d.gender === 0 ? "Male" : "Female");
js
data = d3.csv("/athletes.csv", d3.autoType);
js
Plot = d3.require("@observablehq/plot");

Released under the MIT License.