Completes the three-way BIP-39 mnemonic surface (Rust + Node landed in
0058d9b) and pins down byte-for-byte agreement with crosstest scenarios.
Python (mirrors rust/crates/kez-core/src/mnemonic.rs + nodejs's mnemonic.ts):
• python/kez/mnemonic.py — generate_mnemonic, seed_from_mnemonic,
mnemonic_from_seed_24, ed25519_from_mnemonic,
generate_ed25519_with_mnemonic. Same semantics as Rust + Node: a
24-word phrase is a bijection with the seed; a 12-word phrase is
hashed with domain-tagged SHA-256. Uses Trezor's `mnemonic` library
(v0.21) for the BIP-39 wordlist + entropy parsing; deliberately does
NOT use BIP-39's PBKDF2 to_seed function.
• python/kez/keys.py — Ed25519Secret.from_mnemonic() +
generate_with_mnemonic() classmethods; signer_from_flags widened to
accept --mnemonic.
• python/kez/cli.py — identity new --mnemonic-words, identity
mnemonic [--words], identity from-mnemonic; --mnemonic flag on
claim create/dns and sigchain add/revoke/show/export. Output format
matches Rust + Node verbatim so the crosstest harness can grep
Primary/Public/Secret/Mnemonic lines.
• python/tests/test_mnemonic.py — 19 tests covering all three
canonical vectors (exact-match Secret + Public hex), round-trip,
determinism, whitespace tolerance, bad-checksum, bad-word-count,
the literal domain-tag bytes, and the 12-vs-24 entropy-overlap
non-collision case.
Note: --mnemonic is NOT added to `sigchain publish` because that
subcommand doesn't exist in the Python CLI yet (rust + node only). When
the publish surface is ported, --mnemonic should follow it the same way.
Ground truth — python/MNEMONIC-TEST-VECTORS.md:
V1: 24-word zero-entropy phrase ("abandon… art")
    seed   = 0000…0000
    pubkey = 3b6a27bcceb6a42d62a3a8d02a6f0d73653215771de243a63ac048a18b59da29
V2: 12-word zero-entropy phrase ("abandon… about")
    seed   = 09451c0f06588db78205e32a793536e15ae263c8f9ee6d14f5c6fd82b8bd20da
    pubkey = 9403c32e0d3b4ce51105c0bcac09a0d73be0cca98a6bf7b3cd434651be866d70
V3: 12-word "legal winner thank year wave sausage worth useful legal winner thank yellow"
    seed   = 9df434a2bd5dc767ee949d8ab95ca09c4ebbb88cefc3d0b1523f6b2a744ca824
    pubkey = cc99d06b15ccb83a5ca43f25dd3d27f50638c1c6fbe3a822352da3e07156ce03
The domain tag for the 12-word derivation is exactly the 15 ASCII
bytes of "kez-bip39-12-v1", documented in the spec doc.
crosstest.sh — new "BIP-39 mnemonic interop" section:
• Vector match: each impl × each vector × Public hex == expected (9
scenarios). Catches any silent derivation drift.
• Cross-impl claim signing via --mnemonic: every signer → verifier
pairing between two different impls (rust↔node, rust↔py, node↔py,
each in both directions), in every format (json/compact/markdown).
6 pairings × 3 formats = 18 scenarios.
• Bijection sanity: the 24-word phrase printed by
`identity from-mnemonic` round-trips to itself byte-for-byte
(rust + node).
• Python-involving scenarios auto-skip if `python/.venv/bin/python
kez_cli.py identity from-mnemonic` returns non-zero, so the harness
stays runnable on machines where Python isn't set up.
Verified end-to-end: `bash crosstest.sh` reports
"All 84 scenarios passed."
Test totals across implementations:
Rust: 114 (9 mnemonic-specific in kez-core)
Node: 99 (8 mnemonic-specific in @kez/core)
Python: 19 (mnemonic only; no Python test suite existed before)
Crosstest: 84 scenarios end-to-end
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
# KEZ Mnemonic — canonical test vectors

These vectors are ground truth that **all three implementations
(Rust, Node, Python) MUST match byte-for-byte**. Generated from
the Rust and Node implementations, which have already been verified
to agree (see `mnemonics` branch commit `0058d9b`).

## Semantics

- **24-word phrase** → entropy IS the 32-byte Ed25519 seed (bijection).
- **12-word phrase** → 16-byte entropy → 32-byte seed via
  `SHA-256("kez-bip39-12-v1" || entropy)`.
  Domain tag bytes: `0x6b, 0x65, 0x7a, 0x2d, 0x62, 0x69, 0x70, 0x33, 0x39, 0x2d, 0x31, 0x32, 0x2d, 0x76, 0x31` (15 bytes, UTF-8 of "kez-bip39-12-v1").

Wordlist: BIP-39 English (the canonical 2048-word list).

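The two rules above can be sketched with the standard library alone. This is a minimal illustration, not the project's code: `seed_from_entropy` and `DOMAIN_TAG_12` are hypothetical names, and the sketch assumes the phrase has already been decoded to raw entropy by a BIP-39 library.

```python
import hashlib

# Domain tag quoted in the Semantics section (15 ASCII bytes).
DOMAIN_TAG_12 = b"kez-bip39-12-v1"


def seed_from_entropy(entropy: bytes) -> bytes:
    """Map raw BIP-39 entropy to a 32-byte Ed25519 seed (hypothetical helper)."""
    if len(entropy) == 32:
        # 24-word phrase: the entropy is used as the seed unchanged.
        return bytes(entropy)
    if len(entropy) == 16:
        # 12-word phrase: a single SHA-256 pass over tag || entropy.
        return hashlib.sha256(DOMAIN_TAG_12 + entropy).digest()
    raise ValueError(f"unsupported entropy length: {len(entropy)} bytes")


# 32 zero bytes in, 32 zero bytes out (the V1 case).
print(seed_from_entropy(bytes(32)).hex())
# 16 zero bytes in, a hashed 32-byte seed out (the V2 case).
print(seed_from_entropy(bytes(16)).hex())
```

If the sketch matches the implementations, the second printed value should equal the V2 seed listed under Vectors; comparing the two is a quick way to check an independent port.
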
## Vectors

### V1 — 24-word, all-zero entropy

```
phrase: abandon abandon abandon abandon abandon abandon abandon abandon
        abandon abandon abandon abandon abandon abandon abandon abandon
        abandon abandon abandon abandon abandon abandon abandon art
seed:   0000000000000000000000000000000000000000000000000000000000000000
pubkey: 3b6a27bcceb6a42d62a3a8d02a6f0d73653215771de243a63ac048a18b59da29
```

### V2 — 12-word, all-zero entropy

```
phrase: abandon abandon abandon abandon abandon abandon abandon abandon
        abandon abandon abandon about
seed:   09451c0f06588db78205e32a793536e15ae263c8f9ee6d14f5c6fd82b8bd20da
pubkey: 9403c32e0d3b4ce51105c0bcac09a0d73be0cca98a6bf7b3cd434651be866d70
```

### V3 — 12-word, non-trivial entropy

```
phrase: legal winner thank year wave sausage worth useful legal winner
        thank yellow
seed:   9df434a2bd5dc767ee949d8ab95ca09c4ebbb88cefc3d0b1523f6b2a744ca824
pubkey: cc99d06b15ccb83a5ca43f25dd3d27f50638c1c6fbe3a822352da3e07156ce03
```

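The zero-entropy phrases in V1 and V2 end in a different word because BIP-39 appends checksum bits (the leading bits of SHA-256 of the entropy) before splitting into 11-bit word indices. The sketch below reproduces that encoding step without the wordlist. `bip39_word_indices` is a hypothetical helper written for illustration; it is not part of any of the three implementations.

```python
import hashlib


def bip39_word_indices(entropy: bytes) -> list:
    """Return the 11-bit BIP-39 word indices for `entropy` (hypothetical helper)."""
    ent_bits = len(entropy) * 8
    if ent_bits not in (128, 160, 192, 224, 256):
        raise ValueError("BIP-39 entropy must be 16, 20, 24, 28 or 32 bytes")
    # One checksum bit per 32 bits of entropy, taken from the top of SHA-256.
    cs_bits = ent_bits // 32
    checksum = hashlib.sha256(entropy).digest()[0] >> (8 - cs_bits)
    # Append the checksum bits to the entropy and cut into 11-bit groups.
    combined = (int.from_bytes(entropy, "big") << cs_bits) | checksum
    n_words = (ent_bits + cs_bits) // 11
    return [(combined >> (11 * (n_words - 1 - i))) & 0x7FF for i in range(n_words)]


# 16 zero bytes: eleven times index 0 ("abandon"), then the checksum word.
print(bip39_word_indices(bytes(16)))
# 32 zero bytes: twenty-three times index 0, then the checksum word.
print(bip39_word_indices(bytes(32)))
```

For zero entropy the last index is purely the checksum: 3 for the 12-word case (`about`) and 102 for the 24-word case (`art`), which is why a phrase with the wrong final word fails the bad-checksum check.
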
## What "pubkey" means here

`pubkey` is the 32-byte Ed25519 public key (hex) derived from the seed
above via the standard Ed25519 keypair derivation (the same as
`ed25519-dalek` / `@noble/curves/ed25519`). The KEZ identity string is
`ed25519:<pubkey>`.

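For checking a vector by hand without any crypto dependency, the seed-to-pubkey step can be sketched in pure Python following the arithmetic in RFC 8032. This is an illustration only: it is slow, not constant-time, and must not be used for real keys. `public_key_from_seed` and `kez_identity` are hypothetical names, not the API of any KEZ implementation.

```python
import hashlib

P = 2**255 - 19                            # field prime
D = -121665 * pow(121666, P - 2, P) % P    # Edwards curve constant d
SQRT_M1 = pow(2, (P - 1) // 4, P)          # a square root of -1 mod P


def _recover_even_x(y: int) -> int:
    """Recover the even x coordinate of the curve point with this y."""
    x2 = (y * y - 1) * pow(D * y * y + 1, P - 2, P) % P
    x = pow(x2, (P + 3) // 8, P)
    if (x * x - x2) % P != 0:
        x = x * SQRT_M1 % P
    return P - x if x & 1 else x


_GY = 4 * pow(5, P - 2, P) % P
_GX = _recover_even_x(_GY)
BASE = (_GX, _GY, 1, _GX * _GY % P)        # extended coordinates (X, Y, Z, T)


def _add(p, q):
    """Unified point addition in extended coordinates (also used to double)."""
    a = (p[1] - p[0]) * (q[1] - q[0]) % P
    b = (p[1] + p[0]) * (q[1] + q[0]) % P
    c = 2 * p[3] * q[3] * D % P
    d = 2 * p[2] * q[2] % P
    e, f, g, h = b - a, d - c, d + c, b + a
    return (e * f % P, g * h % P, f * g % P, e * h % P)


def _mul(s: int, pt):
    """Double-and-add scalar multiplication (not constant-time)."""
    acc = (0, 1, 1, 0)                     # neutral element
    while s:
        if s & 1:
            acc = _add(acc, pt)
        pt = _add(pt, pt)
        s >>= 1
    return acc


def public_key_from_seed(seed: bytes) -> bytes:
    """Derive the 32-byte Ed25519 public key from a 32-byte seed."""
    if len(seed) != 32:
        raise ValueError("seed must be 32 bytes")
    h = hashlib.sha512(seed).digest()
    a = int.from_bytes(h[:32], "little")
    a &= (1 << 254) - 8                    # clear the low 3 bits and bit 255
    a |= 1 << 254                          # set bit 254
    x, y, z, _ = _mul(a, BASE)
    zinv = pow(z, P - 2, P)
    x, y = x * zinv % P, y * zinv % P
    # Encoding: little-endian y with the parity of x in the top bit.
    return (y | ((x & 1) << 255)).to_bytes(32, "little")


def kez_identity(seed: bytes) -> str:
    """Format the identity string described above (hypothetical helper)."""
    return "ed25519:" + public_key_from_seed(seed).hex()


print(kez_identity(bytes(32)))
```

With the all-zero V1 seed this prints `ed25519:` followed by the V1 pubkey. In practice an implementation would call its Ed25519 library instead of hand-rolled arithmetic.
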
## Implementation crib

Both Rust and Node load the **raw entropy** from the BIP-39 phrase
(not the BIP-39 PBKDF2-derived 64-byte seed). 24-word entropy is 32
bytes and is used directly as the seed. 12-word entropy is 16 bytes
and is hashed once with the domain tag to produce the 32-byte seed.

This deliberately differs from how hardware wallets use the same
phrases (which feed the PBKDF2 64-byte seed into BIP-32 derivation).
KEZ has one identity per phrase, no derivation tree.
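
To make the contrast concrete, the sketch below computes the standard BIP-39 wallet seed (PBKDF2-HMAC-SHA512, 2048 iterations, salt `"mnemonic"` plus an optional passphrase) for the V1 phrase and sets it next to the KEZ seed. `bip39_pbkdf2_seed` is a hypothetical helper shown only to illustrate the path KEZ does not take.

```python
import hashlib
import unicodedata


def bip39_pbkdf2_seed(phrase: str, passphrase: str = "") -> bytes:
    """Standard BIP-39 64-byte seed. KEZ does NOT use this derivation."""
    password = unicodedata.normalize("NFKD", phrase).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, 64)


# V1 phrase: 23 x "abandon" followed by "art".
V1_PHRASE = " ".join(["abandon"] * 23 + ["art"])

wallet_seed = bip39_pbkdf2_seed(V1_PHRASE)  # 64 bytes, input to BIP-32 in wallets
kez_seed = bytes(32)                        # V1: the raw entropy is the seed

print(len(wallet_seed), wallet_seed.hex())
print(len(kez_seed), kez_seed.hex())
```

The same phrase therefore yields unrelated key material in a wallet and in KEZ, so importing a KEZ phrase into a wallet (or the reverse) will not reproduce the same Ed25519 identity.
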