Toward Solving Micro Shogi

Micro Shogi is a 4×5 drop-shogi variant with five pieces a side: King, Gold, Silver, Bishop, and Pawn. By board size it sits above Dōbutsu Shōgi and Shogi4, and below Minishogi.

The full game is still unsolved. What I have now is a measured run that makes the solve budget concrete: the K+P+G reduced game, with King, Pawn, and Gold per side, is a draw from the start. It has 869,287,068 canonical reachable positions, a maximum distance-to-mate of 155 plies, and it just barely fit in the first 64 GB cloud run.

The number can look modest next to Shogi4’s 2.1-billion-position closed run. KPG is smaller: it removes Silver and Bishop. The point of the run is that this is where Micro Shogi’s convenient solver already hit the memory wall. A Shogi4-style dense-rank solver reproduced the same result with ~6.86 GiB and finished faster.

For scale:

Game or run	Status	Size
Dōbutsu Shōgi	full game solved	246,803,167 reachable positions
Micro Shogi KPG	reduced game solved	869,287,068 reachable positions
Shogi4 largest closed run	reduced game solved	2,100,849,024 positions
Full Micro Shogi	estimated, unrun	~5×10¹⁴ reachable positions

Try the rules directly below. The viewer enforces legal moves, capture-flip promotion, either-face drops, and king-capture terminal positions. It does not show tablebase values.

Open standalone →

The game

Each side starts with S G B K on the home rank, with a pawn in front of the king. Pieces move like their shogi equivalents: King, Gold, Silver, Bishop, and Pawn, plus the reverse faces Rook, Lance, Tokin, and Knight.

Micro Shogi has no promotion zone. Instead, every non-king piece flips when it captures. The pairs are Gold/Rook, Silver/Lance, Bishop/Tokin, and Pawn/Knight.

Captured pieces enter the capturer’s hand. A hand piece may be dropped on any empty square with either face up. There is no nifu, no pawn-drop-mate ban, and no last-rank restriction.

The sources I found define the win condition as checkmate. The solver models that as king capture, the same terminal convention I used for Dōbutsu. Repetition is still a rules gap in the sources, so unresolved cycles are draws in this model.

The measured runs

Reduced-piece Micro Shogi games are useful, but they are not endgame slices of the full game. Drops recycle captured material, so a K+P+G solve is a smaller sibling game with the same board and rules machinery.

Here is the current calibration ladder:

Rung	Pieces per side	Canonical reachable	Start	Wins / losses / draws	Max DTM	Avg branching
KP	K, P	457,993	draw	135,804 / 2,956 / 319,233	29	6.86
KPG	K, P, G	869,287,068	draw	606,922,331 / 142,074,547 / 120,290,190	155	11.79

The KPG run was the first Micro Shogi rung large enough to expose the real scaling problem. I originally expected something closer to 130 million positions. The completed count was 869 million. The first attempt also found a concrete bug: predecessor offsets were stored as u32, and the reverse graph had crossed the 2³²-edge boundary.

After fixing that, the stored-CSR run completed on a 64 GB Hetzner box:

Quantity	Value
Machine	16 vCPU, 61 GiB RAM, 64 GiB swap
Elapsed wall time	4:21:08
Core solve time	3:26:25
Peak RSS	~60.5 GiB
Total legal moves	10,250,756,260
Propagated resolved-child edges	4,567,032,875
Propagation rate	92.7 ns/edge
Raw table dump	17,385,741,380 bytes
Compressed table backup	2,181,879,983 bytes
Audit	PASS

The table is drawish but active. About 69.82% of positions are wins for the side to move, 16.34% are losses, and 13.84% are draws. Most decided positions finish quickly: median DTM is 2, p90 is 16, p99 is 54, and only 11 winning positions reach DTM 155.

The dense-rank result

The 64 GB run was a useful calibration, but the more important result was the rerun. It changed the method from “store every reachable predecessor edge” to “rank positions densely and generate predecessors on demand.”

That solver enumerates reachable positions, stores a HashMap from position key to dense id, and stores a compressed sparse row list of predecessor edges. It is easy to validate and fast while it fits in memory. It becomes the problem once the graph is billions of edges.

The Shogi4 design uses the scalable shape: rank a position directly into an integer slot, store flat arrays, and generate predecessors on demand by undoing moves. That spends slots on unreachable arrangements and spends CPU on rank/unrank, but avoids the giant reachable-key map and the stored reverse graph.

I reran KPG with a KPG-specific dense ranker and an on-demand predecessor generator:

Method	Index	Predecessors	Peak memory	Time
Reachable + CSR	869M reachable ids	Stored reverse graph	~60.5 GiB	4:21:08 full run
Dense rank, first pass	2.04B rank slots	Generated on demand, with mirror duplicates	~6.86 GiB	3:31:48 total
Dense rank, mirror-aware	2.04B rank slots	Generated on demand, dedup-free	~6.86 GiB	2:21:04 total

The dense rank domain is 2,037,557,340 slots, or 2.34× the reachable KPG count. That is the trade: the table contains holes, but the solver no longer carries the reverse graph.

Both dense runs matched the audited CSR values. The first dense pass also caught a smaller efficiency bug: it generated both mirror orientations for every child, canonicalized them, and threw half away. The mirror-aware rerun removed 8.16 billion duplicate predecessor ids, cutting propagation from 9,229 seconds to 4,961 seconds.

The final dense run is the useful comparison point: same values, 8.8× less memory, and 1.46× faster than the CSR core solve on one core.

Full-game sizing

The exact all-arrangements upper bound for Micro Shogi is:

3,915,109,365,634,620

That count uses the same model Tanaka used for Dōbutsu: kings stay on the board, non-king pieces may be on the board or in hand, in-hand pieces have owner but no face, and on-board pieces have owner and face.

The reachable count is still an estimate. Bracketing from Dōbutsu and Minishogi gives roughly 3.0×10¹⁴ to 6.2×10¹⁴ reachable positions, with a point estimate around 5×10¹⁴.

The awkward part is the working index. A published tablebase can store only reachable positions after the solve. A scalable solver probably cannot build a reachable-only minimal perfect hash for ~5×10¹⁴ keys. The practical ranker spans the arrangement domain instead.

That gives two different budgets:

Basis	What it means	Minimum W/L/D table	Compute	Cost estimate
Reachable floor	Ideal reachable-only table	~134 TB	~150 core-years	~$10-15k bare metal
Arrangement rank	Practical dense-rank working domain	~1 PB	~660-1,200 core-years	~$40-70k bare metal, ~$150-280k cloud

The second row is the one I would plan around today. It is the same lesson the Shogi4 work forced into the open: “reachable positions” are the right number for the final compressed artifact, but dense tablebase construction usually pays for a larger rank domain while it computes.

In wall-clock terms, the arrangement-rank estimate is hundreds of years on one core, around 8-14 months on 1,000 cores, or roughly 3-6 weeks on 10,000 cores, before adding engineering overhead and validation reruns. The memory number is the flat value table, not the entire cluster footprint. A real run also needs counters, queues, checkpoints, and shuffle space.

That is why Micro Shogi is a later target than Shogi4. It is small enough to cost, but large enough that a casual cloud run would be an expensive way to discover an architecture bug.

What the runs changed

Before KPG, the Micro Shogi plan was mostly a scaling argument from KP and Dōbutsu. After KPG, the key uncertainties are narrower:

The current reachable-HashMap + CSR solver can answer calibration questions through KPG, but it is marginal on 64 GB and stops here.
Dense rank plus on-demand predecessors is the right production shape for Micro Shogi, not just a Shogi4 detail.
The memory trade is measured: 60.5 GiB down to 6.86 GiB on the same KPG values.
Mirror handling matters enough to measure. Generating symmetric duplicate predecessors almost doubled propagation time in the first dense run.
The full solve should be budgeted against the arrangement-rank domain, not the reachable-position estimate.

The remaining work is straightforward to name and still substantial to do: confirm the rules against a primary source or an independent engine, replace the reachable-count bracket with an exact count, build bucketed dense rank/unrank for the full material set, implement full Micro Shogi unmove generation, and run a smaller distributed rehearsal before paying for the complete tablebase.

For now, the result is a calibrated status report: KPG is solved and audited, the scalable method has been tested on the same data, and the full Micro Shogi solve looks PB-scale rather than laptop-scale.

Resources:

Micro Shogi viewer: legal-move viewer embedded above.
brianhliou/micro-shogi: rules engine, calibration solver, state-space enumerator, and viewer.
Toward Solving Shogi4: the smaller 4×4 drop-shogi run and full-solve design.
Solving Dōbutsu Shōgi: the fully solved rung below.