A Mexican bank's loan book is a table. Each row is a borrower, an amount, a status. The rows don't know about each other. The borrowers do. They guarantee each other's loans, sit on each other's boards, own pieces of the same companies. This page walks through four risks that are invisible in the table and obvious in the graph.
This is the view a risk analyst opens every morning: 458 loans, a healthy headline NPL, five sectors. Everything an aggregate can tell you is here. What it can't tell you is who is connected to whom.
| loan_id | client | balance | status | score |
|---|---|---|---|---|
| PRE-00313 | CLI-00010 | $87,200 | ACTIVO | 725.8 |
| PRE-00127 | CLI-00011 | $132,511 | ACTIVO | 798.2 |
| PRE-00254 | CLI-00012 | $132,511 | VENCIDO | 649.6 |
| PRE-00087 | CLI-00020 | $50,922 | ACTIVO | 710.1 |
| PRE-00308 | CLI-00021 | $9,200 | ACTIVO | 695.3 |
| ... 453 more rows | ||||
Looks like a healthy portfolio. One past-due loan in five shown. Nothing unusual.
Homero has a score of 725 and looks fine. But he guarantees Manuel's loan. Manuel guarantees Cecilia's loan. Cecilia's loan is 45 days past due. Homero's score never saw this coming.
Four questions a loan table cannot answer. Not because SQL is broken. Because the questions are about connections, and a flat table has no language for them.
Homero Aguayo has a bureau score of 725. Well above the threshold. His own loan is current, no past-due days. Every standard metric says he's fine.
The graph says: Homero guarantees Manuel's loan. Manuel guarantees Cecilia's loan. Cecilia's loan is 45 days past due.
If Cecilia can't pay, Manuel is liable. If Manuel can't cover it, Homero is liable. His credit score was answering a different question: one about Homero in isolation, not Homero in context.
In 2008, AIG's guarantee chains on CDO tranches worked exactly this way. The depth of exposure was unknowable until the underlying asset failed.
hop count hardcoded when query is written
chain 3 hops long? this query doesn't exist
SELECT c.id_cliente, c.score_buro
FROM clientes c
JOIN garantias g1 ON c.id_cliente = g1.id_garante
JOIN prestamos p1 ON g1.id_prestamo = p1.id_prestamo
JOIN clientes c2 ON p1.id_titular = c2.id_cliente
JOIN garantias g2 ON c2.id_cliente = g2.id_garante
JOIN prestamos p2 ON g2.id_prestamo = p2.id_prestamo
WHERE p2.estatus = 'VENCIDO'
MATCH (a:ClienteIndividual {
id_cliente: 'CLI-00010'
})-[:GARANTIZA]->(p1)
<-[:TIENE_PRESTAMO]-(b)
-[:GARANTIZA]->(p2)
<-[:TIENE_PRESTAMO]-(c)
WHERE p2.estatus = 'VENCIDO'
RETURN
a.score_buro AS guarantor_score,
p2.saldo_vigente_mxn AS exposure,
p2.dias_mora
*1..N) is native in Cypher. SQL requires one JOIN per hop, and the hop count must be decided when writing the query, before you know how deep the exposure actually goes.Three borrowers guarantee each other's loans in a closed loop: A guarantees B's loan, B guarantees C's loan, C guarantees A's loan.
A risk system checking "does this loan have a guarantor?" returns YES for all three. They look fully covered.
But the coverage is fiction. If any one of them can't pay, the guarantee activates for someone already exposed to the same failure. Nobody outside the loop is backing anything.
This is a regulatory violation. CNBV Circular 3/2012 prohibits guarantee structures where coverage is fictitious due to circular dependency.
postgres only. MySQL breaks here.
no cycle guard: stack overflow on real data
you still don't know what the exposure is
WITH RECURSIVE chain AS (
SELECT g.id_garante,
ARRAY[g.id_garante] AS visited
FROM garantias g
UNION ALL
SELECT g2.id_garante,
visited || g2.id_garante
FROM garantias g2, chain
WHERE NOT g2.id_garante = ANY(visited)
)
found the cycle. now what?
loan balances are in another table. start over.
MATCH (c1)-[:GARANTIZA]->(p1)
<-[:TIENE_PRESTAMO]-(c2)
-[:GARANTIZA]->(p2)
<-[:TIENE_PRESTAMO]-(c3)
-[:GARANTIZA]->(p3)
<-[:TIENE_PRESTAMO]-(c1)
WHERE c1 <> c2
AND c2 <> c3
AND c1 <> c3
RETURN c1, c2, c3,
p1.saldo_vigente_mxn,
p2.saldo_vigente_mxn,
p3.saldo_vigente_mxn
HARI MASA SA DE CV is under restructuring. In the loan table, that's one row. In the graph, it's a hub.
Five shareholders have personal loans with the same bank. Two cross-guarantee each other's personal loans. If HARI MASA collapses, all five lose their equity stake simultaneously. The two cross-guarantors are doubly exposed.
SQL sees five independent borrowers. The graph sees one interconnected risk cluster where the company's failure is a single correlated event affecting all five at once.
Bear Stearns in 2008 wasn't the largest bank. It was the most interconnected. Size doesn't predict damage. Topology does.
table names baked in, not discoverable
"all connected entities" has no SQL equivalent
SELECT c.nombre, p.saldo_vigente_mxn
FROM empresas e
JOIN accionistas a
ON e.id_empresa = a.id_empresa
JOIN clientes c
ON a.id_cliente = c.id_cliente
LEFT JOIN prestamos p
ON c.id_cliente = p.id_titular
LEFT JOIN garantias g
ON c.id_cliente = g.id_garante
WHERE e.id_empresa = 'EMP-00031'
MATCH (e:Empresa {
id_empresa: 'EMP-00031'
})<-[:ES_ACCIONISTA_DE|ES_DIRECTOR_DE]-(c)
OPTIONAL MATCH (c)-[:TIENE_PRESTAMO]->(p)
OPTIONAL MATCH (c)-[:GARANTIZA]->(g)
RETURN
c.nombre_completo,
sum(p.saldo_vigente_mxn) AS personal,
sum(g.saldo_vigente_mxn) AS guaranteed
Emilio Maldonado appears in four separate loan files as a controlling shareholder. The portfolio report shows exposure spread across four companies in three sectors. It looks diversified.
The graph shows: it's one person. Score 583. Personal income $18,784 MXN/month. If he has a legal problem, a health crisis, or makes one bad strategic decision, all four companies lose their key decision-maker at the same moment.
Entity-level diversification is an illusion when control is concentrated. SQL sees four borrowers. The graph sees one person.
Lehman Brothers had 7,000 legal entities across 40 countries. The portfolio looked diversified. When the parent failed, all 7,000 failed at once. This is why the G20 mandated the creation of GLEIF in 2011.
finds the count. not the exposure.
two more JOINs for that. which you haven't written.
SELECT a1.id_cliente,
COUNT(DISTINCT a1.id_empresa) AS n
FROM accionistas a1
JOIN accionistas a2
ON a1.id_cliente = a2.id_cliente
AND a1.id_empresa <> a2.id_empresa
GROUP BY a1.id_cliente
HAVING COUNT(DISTINCT a1.id_empresa) >= 2
still 2 JOINs short of knowing the actual risk
MATCH (c:ClienteIndividual)
-[:ES_ACCIONISTA_DE|ES_DIRECTOR_DE]->(e)
WITH c, count(e) AS n
WHERE n >= 2
MATCH (c)-[:ES_ACCIONISTA_DE]->(e2)
-[:TIENE_PRESTAMO]->(p)
RETURN
c.nombre_completo, n,
sum(p.saldo_vigente_mxn)
AS true_concentration
ORDER BY true_concentration DESC
No orchestrator. Each script reads the output of the previous one. Real data sources for ownership topology, synthetic credit layer calibrated to CNBV benchmarks.
These numbers aren't auditable facts. They come from synthetic distributions calibrated to CNBV benchmarks. Each finding is better read as: this kind of exposure exists in real portfolios, and here is how you'd find it.
The difference between this prototype and production is data source and scale, not logic. The UK PSC topology was chosen precisely because real ownership structures (shared directors, nested ownership) cannot be replicated by random generators. You need real corporate control relationships to make graph queries meaningful.
A small slice of the live graph: 29 of the 853 nodes, chosen because each one participates in at least one of the four patterns below. Drag a node, hover for details, click a finding to see just its subgraph.