The Graph Centrality Trio: Degree, Betweenness, and Closeness - ACERS Engineering Analysis

Subtitle / Abstract
Centrality is not just a paper concept. In graph systems, it is a practical node-importance ranking engine. This article follows the ACERS structure to explain Degree / Betweenness / Closeness and gives one pragmatic conclusion: most online systems reliably support only Degree + approximate Betweenness.

Estimated reading time: 12-16 minutes
Tags: Graph Theory, Centrality, Degree, Betweenness, Closeness
SEO keywords: graph centrality, Degree Centrality, Betweenness, Closeness, approximate Betweenness
Meta description: Engineering guide to graph centrality: definitions, complexity, approximation methods, and production strategies, with runnable code.

Target Audience

Engineers working on relationship graph analysis, knowledge graphs, or graph-database query optimization
Developers who need to turn “node importance” from concept into production metric
Practitioners who want to understand why Betweenness is expensive in production and how to approximate it

Background / Motivation

In graph systems, you will eventually face questions like these:

Which nodes are social influencers or transaction hubs?
Which nodes are key bridges whose removal significantly fragments the graph?
Which nodes are globally closer to others and better suited as entry points or cache hotspots?

These map directly to centrality metrics:

Degree Centrality: how many connections a node has (local importance)
Betweenness Centrality: whether a node lies on many shortest paths (bridge importance)
Closeness Centrality: whether a node has shorter average distance to the full graph (global proximity)

In practice, the biggest challenge is not definition - it is compute cost:

Degree is very cheap and almost always supports real-time use
Exact Betweenness is expensive and is usually offline or approximate
Closeness requires many shortest-path computations and quickly becomes hard to run online on large graphs

Core Concepts

1) Degree Centrality

For node v in an undirected graph, Degree centrality is commonly written as:

C_D(v) = deg(v) / (n - 1)

Meaning: local connectivity activity of the node.

2) Betweenness Centrality

C_B(v) = Σ_{s≠v≠t} (σ_st(v) / σ_st)

σ_st: number of shortest paths from s to t
σ_st(v): number of those shortest paths that pass through v

Meaning: mediation power of the node as a channel or bridge.

3) Closeness Centrality

C_C(v) = (n - 1) / Σ_{u≠v} d(v, u)

Meaning: how close the node is, overall, to all other nodes in the graph.

Practical note: disconnected graphs often use harmonic closeness to avoid denominator issues caused by unreachable nodes.

A — Algorithm (Problem and Algorithm)

Problem Restatement (Engineering Version)

Given graph G=(V,E), compute centrality scores for each node and return Top-K nodes:

Degree centrality
Betweenness centrality (approximation allowed)
Closeness centrality (or harmonic variant)

Input/Output

Name	Type	Description
graph	adjacency list	`graph[u] = [v1, v2, ...]` (unweighted)
k	int	number of Top-K outputs
mode	str	`degree` / `betweenness` / `closeness`
return	List[(node, score)]	sorted node scores

Example 1 (Small Graph)

A-B-C-D and B-E

Intuition:
- B has high degree -> high Degree
- B/C lie on many shortest paths -> high Betweenness
- B/C are closer on average to other nodes -> higher Closeness

Example 2 (Bridge Node)

Two clusters are connected through X

X usually has very high Betweenness, even if its Degree is not the highest

Reasoning Path (From Naive to Production)

Naive Approach

Compute shortest paths for every pair of nodes, then count how often each node is traversed
Complexity is too high for large graphs

Key Observations

Degree uses only local adjacency and is near linear complexity
Betweenness can be significantly optimized with Brandes, but remains relatively expensive
Closeness fundamentally needs many-source shortest paths, so cost rises rapidly with graph size

Engineering Decisions

Online: prioritize Degree, add sampled approximate Betweenness when necessary
Offline batch: compute fuller Betweenness / Closeness
Large graphs: use Top-K + sampling + layered cache together

C — Concepts (Core Ideas)

Method Categories

Degree: local statistics
Betweenness: dependency accumulation over global shortest paths
Closeness: global distance aggregation

Complexity Intuition (Unweighted Graph)

Metric	Common Algorithm	Rough Complexity
Degree	traverse adjacency list	`O(V+E)`
Betweenness	Brandes	`O(VE)`
Closeness	run BFS from every node	`O(V(V+E))`

Practical Conclusion (Key Point)

Most online systems reliably support only Degree + approximate Betweenness.
Closeness is often moved offline or computed only on small subgraphs.

The reasons are straightforward:

Degree is low-cost, highly interpretable, and easy to update incrementally
Exact Betweenness is too expensive, while approximation is controllable
Closeness is sensitive to connectivity and graph size, making online SLA hard to guarantee

Practical Guide / Steps

Step 1: Define the Business Question First

Looking for highly connected nodes: Degree
Looking for critical bridges: Betweenness
Looking for global proximity centers: Closeness

Step 2: Choose Online or Offline

Online services: Degree + approximate Betweenness
Offline reporting: add Closeness / refined Betweenness

Step 3: Runnable Python Baseline

from collections import deque
import random


def degree_centrality(graph):
    n = max(len(graph), 1)
    return {u: len(graph.get(u, [])) / max(n - 1, 1) for u in graph}


def bfs_dist(graph, s):
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in graph.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist


def closeness_centrality(graph):
    n = len(graph)
    cc = {}
    for u in graph:
        d = bfs_dist(graph, u)
        if len(d) <= 1:
            cc[u] = 0.0
            continue
        s = sum(d.values())
        cc[u] = (len(d) - 1) / s if s > 0 else 0.0
        # Can be switched to harmonic closeness depending on the use case
    return cc


def approx_betweenness_by_sampling(graph, samples=8, seed=0):
    random.seed(seed)
    nodes = list(graph.keys())
    if not nodes:
        return {}

    score = {u: 0.0 for u in nodes}
    sample_sources = random.sample(nodes, min(samples, len(nodes)))

    for s in sample_sources:
        # Single-source shortest-path DAG + dependency back-propagation (Brandes style)
        stack = []
        pred = {v: [] for v in nodes}
        sigma = {v: 0.0 for v in nodes}
        dist = {v: -1 for v in nodes}

        sigma[s] = 1.0
        dist[s] = 0
        q = deque([s])

        while q:
            v = q.popleft()
            stack.append(v)
            for w in graph.get(v, []):
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    pred[w].append(v)

        delta = {v: 0.0 for v in nodes}
        while stack:
            w = stack.pop()
            for v in pred[w]:
                if sigma[w] > 0:
                    delta[v] += (sigma[v] / sigma[w]) * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]

    # Sampling normalization (approximation)
    factor = len(nodes) / max(len(sample_sources), 1)
    return {u: score[u] * factor for u in nodes}


if __name__ == "__main__":
    g = {
        "A": ["B"],
        "B": ["A", "C", "E"],
        "C": ["B", "D"],
        "D": ["C"],
        "E": ["B"],
    }

    print("degree", degree_centrality(g))
    print("closeness", closeness_centrality(g))
    print("approx_betweenness", approx_betweenness_by_sampling(g, samples=3, seed=42))

E — Engineering (Engineering Applications)

Scenario 1: Anti-Fraud Hub Account Detection (Degree)

Background: in money-transfer graphs, highly connected accounts are often transfer hubs.
Why this fits: Degree is fast to compute and suitable as an online risk-control feature.

# online feature: out-degree / in-degree
risk_score = out_degree * 0.6 + in_degree * 0.4

Scenario 2: Critical Bridge Node Alerts (Approximate Betweenness)

Background: in social or transaction graphs, some nodes are the only channel between communities.
Why this fits: Betweenness finds bridges, but exact computation is expensive; sampled approximation is easier to deploy.

// pseudo-go style: run sampled Brandes in batch job
// 1) sample K sources
// 2) accumulate dependency scores
// 3) write top-k bridge nodes to Redis/OLAP

Scenario 3: Entry Selection for Path Explanation (Closeness)

Background: explanation systems may prefer to start path rendering from nodes that are globally closer to core regions.
Why this fits: Closeness captures nodes with shorter average distances.

// Use top-N offline closeness nodes as explanation entry candidates
const candidates = centralityRank.slice(0, N);

R — Reflection (Reflection and Deeper Analysis)

Exact vs Approximate

Metric	Exact Cost	Approximation Strategy	Engineering Recommendation
Degree	Low	Not required	Compute online directly
Betweenness	High	Source-node sampling, Top-K estimation	Read offline results online / batch refresh
Closeness	Medium-high	Subgraph computation, harmonic variant	Mostly used for offline analysis

Common Wrong Approaches

Treating Betweenness as a fully real-time online metric
Applying standard Closeness directly to large disconnected graphs without variant handling
Ignoring directed vs undirected differences, causing interpretation errors

Why “Degree + Approximate Betweenness” Is Most Common

Controllable cost: can satisfy online SLA
Strong interpretability: easy for product and business teams to reason about
Easy evolution path: launch a usable version first, then add refined offline metrics

Explanation and Principles (Why This Works)

The engineering essence of centrality is extracting stable, interpretable importance signals at acceptable cost.

Degree gives local activity
Betweenness gives bridge-control power
Closeness gives global proximity

In real systems, the question is not “which metric is theoretically best,” but “which metric is sustainable under current scale and latency budget.”

Frequently Asked Questions and Notes

Can directed and undirected graphs use the same formulas?
The high-level idea is shared, but counting conventions differ (in-degree/out-degree, shortest-path direction).
Does Betweenness have to be exact?
Not necessarily. In many cases, approximate ranking is enough, especially for Top-K.
How should Closeness be handled on disconnected graphs?
Harmonic closeness is recommended, or restrict computation to connected subgraphs.
Do centrality scores need real-time updates?
Most systems use “offline batch refresh + online cache.” Only Degree is usually feasible for lightweight real-time increment.

Best Practices and Recommendations

Split centrality into two layers: offline primary computation + online feature service
Start from business questions, then pick metrics; avoid a “metric-first” mindset
Put budgets on Betweenness: sample size, time window, Top-K-only outputs
For large graphs, split by connected components first to avoid indiscriminate full-graph computation

S — Summary (Summary)

Core Takeaways

Degree, Betweenness, and Closeness capture local connectivity, bridge mediation, and global proximity
Betweenness is expensive in production; exact full computation is usually unsuitable online
The pragmatic combination in most systems is Degree + approximate Betweenness
Closeness is better for offline analysis or small subgraphs
Metric choice must obey constraints of scale, latency, and interpretability

Metadata

Reading time: 12-16 minutes
Tags: Graph Theory, Centrality, Degree, Betweenness, Closeness
SEO keywords: graph centrality, Degree Centrality, Betweenness, Closeness, approximate Betweenness
Meta description: Engineering guide to the graph centrality trio: definitions, complexity, approximation, and rollout strategy, with a focus on why most systems only support Degree and approximate Betweenness.

Call To Action (CTA)

A practical next step is to do two things:

Launch Degree + Top-K first to validate business interpretability
Add an offline sampled-Betweenness job and compare ranking stability

If you want, I can write the next engineering continuation on “PageRank + Community Detection (Louvain).”

Multi-language Reference Implementations (Python / C / C++ / Go / Rust / JS)

# Degree centrality (unweighted graph)
def degree_centrality(graph):
    n = max(len(graph), 1)
    return {u: len(graph.get(u, [])) / max(n - 1, 1) for u in graph}

/* degree centrality for adjacency matrix (undirected) */
#include <stdio.h>
#define N 5

int main(void) {
    int g[N][N] = {
        {0,1,0,0,1},
        {1,0,1,0,0},
        {0,1,0,1,0},
        {0,0,1,0,0},
        {1,0,0,0,0}
    };
    for (int i = 0; i < N; i++) {
        int deg = 0;
        for (int j = 0; j < N; j++) deg += g[i][j];
        double cd = (double)deg / (N - 1);
        printf("node %d degree_c=%.3f\n", i, cd);
    }
    return 0;
}

#include <bits/stdc++.h>
using namespace std;

int main() {
    vector<vector<int>> g = {
        {1,4}, {0,2}, {1,3}, {2}, {0}
    };
    int n = (int)g.size();
    for (int u = 0; u < n; ++u) {
        double c = (double)g[u].size() / max(n - 1, 1);
        cout << "node " << u << " degree_c=" << c << "\n";
    }
}

package main

import "fmt"

func main() {
	g := map[int][]int{0: {1,4}, 1: {0,2}, 2: {1,3}, 3: {2}, 4: {0}}
	n := len(g)
	for u, nbrs := range g {
		cd := float64(len(nbrs)) / float64(n-1)
		fmt.Printf("node %d degree_c=%.3f\n", u, cd)
	}
}

use std::collections::HashMap;

fn main() {
    let mut g: HashMap<i32, Vec<i32>> = HashMap::new();
    g.insert(0, vec![1, 4]);
    g.insert(1, vec![0, 2]);
    g.insert(2, vec![1, 3]);
    g.insert(3, vec![2]);
    g.insert(4, vec![0]);

    let n = g.len() as f64;
    for (u, nbrs) in &g {
        let cd = nbrs.len() as f64 / (n - 1.0);
        println!("node {} degree_c={:.3}", u, cd);
    }
}

const g = new Map([
  ["A", ["B", "E"]],
  ["B", ["A", "C"]],
  ["C", ["B", "D"]],
  ["D", ["C"]],
  ["E", ["A"]],
]);

const n = g.size;
for (const [u, nbrs] of g.entries()) {
  const cd = nbrs.length / (n - 1);
  console.log(u, "degree_c=", cd.toFixed(3));
}

Target Audience#

Background / Motivation#

Core Concepts#

1) Degree Centrality#

2) Betweenness Centrality#

3) Closeness Centrality#

A — Algorithm (Problem and Algorithm)#

Problem Restatement (Engineering Version)#

Input/Output#

Example 1 (Small Graph)#

Example 2 (Bridge Node)#

Reasoning Path (From Naive to Production)#

Naive Approach#

Key Observations#

Engineering Decisions#

C — Concepts (Core Ideas)#

Method Categories#

Complexity Intuition (Unweighted Graph)#

Practical Conclusion (Key Point)#

Practical Guide / Steps#

Step 1: Define the Business Question First#

Step 2: Choose Online or Offline#

Step 3: Runnable Python Baseline#

E — Engineering (Engineering Applications)#

Scenario 1: Anti-Fraud Hub Account Detection (Degree)#

Scenario 2: Critical Bridge Node Alerts (Approximate Betweenness)#

Scenario 3: Entry Selection for Path Explanation (Closeness)#

R — Reflection (Reflection and Deeper Analysis)#

Exact vs Approximate#

Common Wrong Approaches#

Why “Degree + Approximate Betweenness” Is Most Common#

Explanation and Principles (Why This Works)#

Frequently Asked Questions and Notes#

Best Practices and Recommendations#

S — Summary (Summary)#

Core Takeaways#

Recommended Further Reading#

Metadata#

Call To Action (CTA)#

Multi-language Reference Implementations (Python / C / C++ / Go / Rust / JS)#