feat(python,crosstest): mirror BIP-39 mnemonic to Python + add interop scenarios

Completes the three-way BIP-39 mnemonic surface (Rust + Node landed in
0058d9b) and pins down byte-for-byte agreement with crosstest scenarios.
Python (mirrors rust/crates/kez-core/src/mnemonic.rs + nodejs's mnemonic.ts):
• python/kez/mnemonic.py — generate_mnemonic, seed_from_mnemonic,
mnemonic_from_seed_24, ed25519_from_mnemonic,
generate_ed25519_with_mnemonic. Same 24-word-bijection / 12-word-
SHA-256-domain-tagged semantics. Uses Trezor's `mnemonic` library
(v0.21) for the BIP-39 wordlist + entropy parsing; deliberately does
NOT use BIP-39's PBKDF2 to_seed function.
• python/kez/keys.py — Ed25519Secret.from_mnemonic() +
generate_with_mnemonic() classmethods; signer_from_flags widened to
accept --mnemonic.
• python/kez/cli.py — identity new --mnemonic-words, identity
mnemonic [--words], identity from-mnemonic; --mnemonic flag on
claim create/dns and sigchain add/revoke/show/export. Output format
matches Rust + Node verbatim so the crosstest harness can grep
Primary/Public/Secret/Mnemonic lines.
• python/tests/test_mnemonic.py — 19 tests covering all three
canonical vectors (exact-match Secret + Public hex), round-trip,
determinism, whitespace tolerance, bad-checksum, bad-word-count,
the literal domain-tag bytes, and the 12-vs-24 entropy-overlap
non-collision case.
Note: --mnemonic is NOT added to `sigchain publish` because that
subcommand doesn't exist in the Python CLI yet (rust + node only). When
the publish surface is ported, --mnemonic should follow it the same way.
Ground truth — python/MNEMONIC-TEST-VECTORS.md:
V1: 24-word zero-entropy phrase ("abandon… art")
seed = 0000…0000
pubkey = 3b6a27bcceb6a42d62a3a8d02a6f0d73653215771de243a63ac048a18b59da29
V2: 12-word zero-entropy phrase ("abandon… about")
seed = 09451c0f06588db78205e32a793536e15ae263c8f9ee6d14f5c6fd82b8bd20da
pubkey = 9403c32e0d3b4ce51105c0bcac09a0d73be0cca98a6bf7b3cd434651be866d70
V3: 12-word "legal winner thank year wave sausage worth useful legal winner thank yellow"
seed = 9df434a2bd5dc767ee949d8ab95ca09c4ebbb88cefc3d0b1523f6b2a744ca824
pubkey = cc99d06b15ccb83a5ca43f25dd3d27f50638c1c6fbe3a822352da3e07156ce03
The domain tag for the 12-word derivation is exactly the 15 ASCII
bytes of "kez-bip39-12-v1", documented in the spec doc.
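
The 12-word rule above can be sketched with the standard library alone. This is a minimal illustration, not the code from `python/kez/mnemonic.py`; the helper name `seed_from_entropy_12` is an assumption made for this example.

```python
import hashlib

# Domain tag quoted above: exactly 15 ASCII bytes.
DOMAIN_TAG_12 = b"kez-bip39-12-v1"


def seed_from_entropy_12(entropy: bytes) -> bytes:
    """Hypothetical helper: 16-byte BIP-39 entropy -> 32-byte Ed25519 seed."""
    if len(entropy) != 16:
        raise ValueError(f"expected 16 bytes of entropy, got {len(entropy)}")
    # SHA-256 over the tag followed directly by the raw entropy.
    return hashlib.sha256(DOMAIN_TAG_12 + entropy).digest()


# V2 uses all-zero entropy, so this hex can be compared with the V2 seed above.
v2_seed_hex = seed_from_entropy_12(bytes(16)).hex()
```

Because the tag is part of the hashed input, changing even one byte of it would change every 12-word identity, which is why the tag is pinned in the vectors file.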
crosstest.sh — new "BIP-39 mnemonic interop" section:
• Vector match: each impl × each vector × Public hex == expected (9
scenarios). Catches any silent derivation drift.
• Cross-impl claim signing via --mnemonic: every signer ↔ verifier
pair (rust↔node, rust↔py, node↔py), every format (json/compact/
markdown). 6 pairings × 3 formats = 18 scenarios.
• Bijection sanity: the 24-word phrase printed by
`identity from-mnemonic` round-trips to itself byte-for-byte (rust + node).
• Python-involving scenarios auto-skip unless the Python venv exists and
`python/.venv/bin/python kez_cli.py identity from-mnemonic` prints a
`Public:` line, so the harness stays runnable on machines where Python
isn't set up.
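
The scenario counts quoted in the list above (9 vector checks, 18 claim round-trips) follow from simple enumeration. The sketch below only reproduces that arithmetic; the list names are illustrative and not taken from `crosstest.sh`.

```python
from itertools import permutations, product

IMPLS = ["rust", "node", "py"]
VECTORS = ["V1", "V2", "V3"]
FORMATS = ["json", "compact", "markdown"]

# Vector match: every implementation checks every vector.
vector_scenarios = list(product(IMPLS, VECTORS))

# Claim signing: every ordered (signer, verifier) pair, in every format.
claim_scenarios = [
    (signer, verifier, fmt)
    for signer, verifier in permutations(IMPLS, 2)
    for fmt in FORMATS
]

print(len(vector_scenarios), len(claim_scenarios))  # prints "9 18"
```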
Verified end-to-end: `bash crosstest.sh` reports
"All 84 scenarios passed."
Test totals across implementations:
Rust: 114 (9 mnemonic-specific in kez-core)
Node: 99 (8 mnemonic-specific in @kez/core)
Python: 19 (mnemonic only; was no test suite before)
Crosstest: 84 scenarios end-to-end
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent 0058d9b421
commit b0cc1a74a0
87
crosstest.sh
@@ -443,6 +443,93 @@ for peer in node rust; do
 done
 rm -f "$PY_ED_FILE"

+# ── BIP-39 Mnemonic interop ─────────────────────────────────────────────────
+# 12- and 24-word phrases must derive identical Ed25519 keys across all
+# implementations, and a claim signed with --mnemonic in one impl must
+# verify in the others. See python/MNEMONIC-TEST-VECTORS.md for the
+# definitive ground-truth vectors.
+printf "%sBIP-39 mnemonic interop:%s\n" "$YELLOW" "$RESET"
+
+# Canonical test vectors. Public keys are the expected outputs that all
+# three implementations MUST agree on byte-for-byte. If any of these
+# values change, an implementation has a derivation bug.
+MNEMO_P24="abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon art"
+MNEMO_PUB_24="3b6a27bcceb6a42d62a3a8d02a6f0d73653215771de243a63ac048a18b59da29"
+MNEMO_P12="abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about"
+MNEMO_PUB_12="9403c32e0d3b4ce51105c0bcac09a0d73be0cca98a6bf7b3cd434651be866d70"
+MNEMO_P12B="legal winner thank year wave sausage worth useful legal winner thank yellow"
+MNEMO_PUB_12B="cc99d06b15ccb83a5ca43f25dd3d27f50638c1c6fbe3a822352da3e07156ce03"
+
+# Probe: does the Python CLI know about `identity from-mnemonic` yet?
+PY_HAS_MNEMONIC=0
+if [[ -x "$PYTHON_VENV" ]]; then
+  if "${PYTHON_CLI[@]}" identity from-mnemonic "$MNEMO_P12" 2>/dev/null \
+      | grep -q "^Public:"; then
+    PY_HAS_MNEMONIC=1
+  fi
+fi
+
+# Helper: assert the impl derives the expected pubkey from a phrase.
+assert_pubkey() {
+  local impl="$1" phrase="$2" expected="$3" title="$4"
+  scenario "$title"
+  local actual
+  actual=$(run_cli "$impl" identity from-mnemonic "$phrase" 2>/dev/null \
+    | awk -F': *' '/^Public:/ {print $2; exit}')
+  if [[ "$actual" == "$expected" ]]; then ok; else
+    bad "$title" "expected pubkey $expected, got $actual"
+  fi
+}
+
+# Vector matches per impl.
+for impl in rust node; do
+  assert_pubkey "$impl" "$MNEMO_P24" "$MNEMO_PUB_24" "$impl: V1 24-word vector derives expected pubkey"
+  assert_pubkey "$impl" "$MNEMO_P12" "$MNEMO_PUB_12" "$impl: V2 12-word vector derives expected pubkey"
+  assert_pubkey "$impl" "$MNEMO_P12B" "$MNEMO_PUB_12B" "$impl: V3 12-word vector derives expected pubkey"
+done
+if [[ "$PY_HAS_MNEMONIC" -eq 1 ]]; then
+  assert_pubkey py "$MNEMO_P24" "$MNEMO_PUB_24" "py: V1 24-word vector derives expected pubkey"
+  assert_pubkey py "$MNEMO_P12" "$MNEMO_PUB_12" "py: V2 12-word vector derives expected pubkey"
+  assert_pubkey py "$MNEMO_P12B" "$MNEMO_PUB_12B" "py: V3 12-word vector derives expected pubkey"
+else
+  printf " %sskip%s %s\n" "$YELLOW" "$RESET" \
+    "py vector checks (python CLI lacks identity from-mnemonic — port still in flight)"
+fi
+
+# Cross-impl claim signing with --mnemonic. Each impl signs, each other
+# verifies. Uses the V3 phrase because it has non-trivial entropy.
+for fmt in json compact markdown; do
+  claim_roundtrip "rust mnemonic ($fmt) ⇒ node verify" rust node "$fmt" --mnemonic "$MNEMO_P12B"
+  claim_roundtrip "node mnemonic ($fmt) ⇒ rust verify" node rust "$fmt" --mnemonic "$MNEMO_P12B"
+  if [[ "$PY_HAS_MNEMONIC" -eq 1 ]]; then
+    claim_roundtrip "py mnemonic ($fmt) ⇒ rust verify" py rust "$fmt" --mnemonic "$MNEMO_P12B"
+    claim_roundtrip "rust mnemonic ($fmt) ⇒ py verify" rust py "$fmt" --mnemonic "$MNEMO_P12B"
+    claim_roundtrip "py mnemonic ($fmt) ⇒ node verify" py node "$fmt" --mnemonic "$MNEMO_P12B"
+    claim_roundtrip "node mnemonic ($fmt) ⇒ py verify" node py "$fmt" --mnemonic "$MNEMO_P12B"
+  fi
+done
+if [[ "$PY_HAS_MNEMONIC" -ne 1 ]]; then
+  printf " %sskip%s %s\n" "$YELLOW" "$RESET" \
+    "py mnemonic claim round-trips (port still in flight)"
+fi
+
+# Bijection sanity: 24-word phrase ⇄ seed must be exact. Each impl must
+# produce the canonical phrase from a known 32-byte seed via the
+# mnemonic-from-seed path (we drive it indirectly via the printed output
+# of `identity from-mnemonic`).
+scenario "24-word phrase is canonical form of its seed (rust)"
+got=$("${RUST_CLI[@]}" identity from-mnemonic "$MNEMO_P24" 2>/dev/null \
+  | awk -F': *' '/^Mnemonic .24 words/ { match($0, /"[^"]+"/); print substr($0, RSTART+1, RLENGTH-2); exit }')
+if [[ "$got" == "$MNEMO_P24" ]]; then ok; else
+  bad "rust canonical-24" "round-trip phrase differs"
+fi
+scenario "24-word phrase is canonical form of its seed (node)"
+got=$("${NODE_CLI[@]}" identity from-mnemonic "$MNEMO_P24" 2>/dev/null \
+  | awk -F': *' '/^Mnemonic .24 words/ { match($0, /"[^"]+"/); print substr($0, RSTART+1, RLENGTH-2); exit }')
+if [[ "$got" == "$MNEMO_P24" ]]; then ok; else
+  bad "node canonical-24" "round-trip phrase differs"
+fi
+
 printf "\n"
 if [[ $FAIL -eq 0 ]]; then
   printf "%sAll %d scenarios passed.%s\n" "$GREEN" "$PASS" "$RESET"
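
For readers more at home in Python than awk, the line extraction done by the harness can be approximated as follows. This is a sketch that assumes the output shape printed by the Python CLI in this commit (`Primary:`, `Public:`, `Secret:` and `Mnemonic (N words): "..."` lines); the function name `parse_identity_output` is hypothetical.

```python
import re

# Matches e.g.:  Mnemonic (24 words): "abandon abandon ... art"
_MNEMONIC_RE = re.compile(r'^Mnemonic \((\d+) words\): "([^"]+)"')


def parse_identity_output(text: str) -> dict[str, str]:
    """Hypothetical helper: pull the grep-able fields out of CLI output."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        match = _MNEMONIC_RE.match(line)
        if match:
            fields["words"] = match.group(1)
            fields["mnemonic"] = match.group(2)
            continue
        key, sep, value = line.partition(":")
        if sep and key in ("Primary", "Public", "Secret"):
            # Keep only the first token, dropping notes like "(32-byte seed)".
            fields[key.lower()] = value.strip().split(" ")[0]
    return fields
```

A harness written this way would compare `fields["public"]` with the expected hex, which is the same check `assert_pubkey` performs.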
63
python/MNEMONIC-TEST-VECTORS.md
Normal file
@@ -0,0 +1,63 @@
# KEZ Mnemonic — canonical test vectors

These vectors are ground truth that **all three implementations
(Rust, Node, Python) MUST match byte-for-byte**. Generated from
the Rust and Node implementations, which have already been verified
to agree (see `mnemonics` branch commit `0058d9b`).

## Semantics

- **24-word phrase** → entropy IS the 32-byte Ed25519 seed (bijection).
- **12-word phrase** → 16-byte entropy → 32-byte seed via
  `SHA-256("kez-bip39-12-v1" || entropy)`.
  Domain tag bytes: `0x6b, 0x65, 0x7a, 0x2d, 0x62, 0x69, 0x70, 0x33, 0x39, 0x2d, 0x31, 0x32, 0x2d, 0x76, 0x31` (15 bytes, UTF-8 of "kez-bip39-12-v1").

Wordlist: BIP-39 English (the canonical 2048-word list).
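
As background for the two phrase lengths, the sketch below works through the BIP-39 bit layout: each word carries an 11-bit index, and the phrase is the entropy followed by a checksum of one bit per 32 entropy bits. The function names are illustrative assumptions, and the wordlist lookup itself is left out.

```python
import hashlib


def bip39_layout(word_count: int) -> tuple[int, int]:
    """Return (entropy_bits, checksum_bits) for a BIP-39 phrase length."""
    total_bits = word_count * 11  # each word encodes an 11-bit index
    if total_bits % 33 != 0:
        raise ValueError(f"{word_count} words is not a valid BIP-39 length")
    checksum_bits = total_bits // 33  # checksum is entropy_bits / 32
    return total_bits - checksum_bits, checksum_bits


def last_word_index(entropy: bytes) -> int:
    """Wordlist index of the final word: trailing entropy bits + checksum."""
    checksum_bits = len(entropy) * 8 // 32
    # The checksum is the leading bits of SHA-256(entropy); at most 8 bits here.
    checksum = hashlib.sha256(entropy).digest()[0] >> (8 - checksum_bits)
    combined = (int.from_bytes(entropy, "big") << checksum_bits) | checksum
    return combined & 0x7FF  # low 11 bits select the last word
```

This is why the all-zero vectors end in a word other than "abandon": the final index mixes in checksum bits, which are non-zero even when the entropy is all zeros.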

## Vectors

### V1 — 24-word, all-zero entropy

```
phrase: abandon abandon abandon abandon abandon abandon abandon abandon
        abandon abandon abandon abandon abandon abandon abandon abandon
        abandon abandon abandon abandon abandon abandon abandon art
seed:   0000000000000000000000000000000000000000000000000000000000000000
pubkey: 3b6a27bcceb6a42d62a3a8d02a6f0d73653215771de243a63ac048a18b59da29
```

### V2 — 12-word, all-zero entropy

```
phrase: abandon abandon abandon abandon abandon abandon abandon abandon
        abandon abandon abandon about
seed:   09451c0f06588db78205e32a793536e15ae263c8f9ee6d14f5c6fd82b8bd20da
pubkey: 9403c32e0d3b4ce51105c0bcac09a0d73be0cca98a6bf7b3cd434651be866d70
```

### V3 — 12-word, non-trivial entropy

```
phrase: legal winner thank year wave sausage worth useful legal winner
        thank yellow
seed:   9df434a2bd5dc767ee949d8ab95ca09c4ebbb88cefc3d0b1523f6b2a744ca824
pubkey: cc99d06b15ccb83a5ca43f25dd3d27f50638c1c6fbe3a822352da3e07156ce03
```

## What "pubkey" means here

`pubkey` is the 32-byte Ed25519 public key (hex) derived from the seed
above via the standard Ed25519 keypair derivation (the same as
`ed25519-dalek` / `@noble/curves/ed25519`). The KEZ identity string is
`ed25519:<pubkey>`.
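
To make "standard Ed25519 keypair derivation" concrete, here is a dependency-free sketch that follows the reference formulas in RFC 8032. It is for illustration only: it is not constant-time and is not what any of the three implementations use (they rely on their Ed25519 libraries). All names in it are assumptions of this example.

```python
import hashlib

P = 2**255 - 19                          # field prime
D = -121665 * pow(121666, P - 2, P) % P  # curve constant d
SQRT_M1 = pow(2, (P - 1) // 4, P)        # square root of -1 mod P


def _recover_x(y: int, sign: int) -> int:
    """Recover the x coordinate with the given low bit from y."""
    x2 = (y * y - 1) * pow(D * y * y + 1, P - 2, P) % P
    x = pow(x2, (P + 3) // 8, P)
    if (x * x - x2) % P != 0:
        x = x * SQRT_M1 % P
    if (x & 1) != sign:
        x = P - x
    return x


_GY = 4 * pow(5, P - 2, P) % P
_GX = _recover_x(_GY, 0)
BASE = (_GX, _GY, 1, _GX * _GY % P)      # base point, extended coordinates


def _add(p, q):
    """Unified point addition on the twisted Edwards curve."""
    a = (p[1] - p[0]) * (q[1] - q[0]) % P
    b = (p[1] + p[0]) * (q[1] + q[0]) % P
    c = 2 * p[3] * q[3] * D % P
    d = 2 * p[2] * q[2] % P
    e, f, g, h = b - a, d - c, d + c, b + a
    return (e * f % P, g * h % P, f * g % P, e * h % P)


def _mul(scalar: int, point):
    """Double-and-add scalar multiplication (not constant-time)."""
    result = (0, 1, 1, 0)                # neutral element
    while scalar:
        if scalar & 1:
            result = _add(result, point)
        point = _add(point, point)
        scalar >>= 1
    return result


def pubkey_from_seed(seed: bytes) -> bytes:
    """Derive the 32-byte Ed25519 public key from a 32-byte seed."""
    if len(seed) != 32:
        raise ValueError(f"seed must be 32 bytes, got {len(seed)}")
    digest = hashlib.sha512(seed).digest()
    a = int.from_bytes(digest[:32], "little")
    a &= (1 << 254) - 8                  # clear the low 3 bits and the top bits
    a |= 1 << 254                        # set bit 254
    x, y, z, _ = _mul(a, BASE)
    z_inv = pow(z, P - 2, P)
    x, y = x * z_inv % P, y * z_inv % P
    return (y | ((x & 1) << 255)).to_bytes(32, "little")
```

Calling `pubkey_from_seed(bytes(32)).hex()` should reproduce the V1 `pubkey` listed in the vectors, since V1's seed is all zeros.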

## Implementation crib

Both Rust and Node load the **raw entropy** from the BIP-39 phrase
(not the BIP-39 PBKDF2-derived 64-byte seed). 24-word entropy is 32
bytes and is used directly as the seed. 12-word entropy is 16 bytes
and is hashed once with the domain tag to produce the 32-byte seed.

This deliberately differs from how hardware wallets use the same
phrases (which feed the PBKDF2 64-byte seed into BIP-32 derivation).
KEZ has one identity per phrase, no derivation tree.
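
For contrast, the wallet-style path that KEZ deliberately avoids looks roughly like this. The sketch shows the standard BIP-39 seed function (PBKDF2-HMAC-SHA512, 2048 iterations, salt `"mnemonic"` plus passphrase); the function name is an assumption of this example and nothing in KEZ calls it.

```python
import hashlib
import unicodedata


def bip39_pbkdf2_seed(phrase: str, passphrase: str = "") -> bytes:
    """Standard BIP-39 seed: 64 bytes, intended as input to BIP-32."""
    password = unicodedata.normalize("NFKD", phrase).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


wallet_seed = bip39_pbkdf2_seed(
    "abandon abandon abandon abandon abandon abandon "
    "abandon abandon abandon abandon abandon about"
)
print(len(wallet_seed))  # prints "64"
```

The 64-byte result has no fixed relationship to the 32-byte KEZ seed for the same phrase, so a phrase reused in a wallet and in KEZ yields unrelated keys.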
@@ -23,6 +23,11 @@ from .envelope import (
 )
 from .identity import Identity
 from .keys import Ed25519Secret, NostrSecret, signer_from_flags
+from .mnemonic import (
+    ed25519_from_mnemonic,
+    generate_ed25519_with_mnemonic,
+    generate_mnemonic,
+)


 def _eprint(msg: str) -> None:
@@ -44,14 +49,10 @@ def write_or_print(out: str | None, output: str) -> None:


 def cmd_identity_new(args: argparse.Namespace) -> int:
-    if args.key_type == "ed25519":
-        secret = Ed25519Secret.generate()
-        print(f"Primary: {secret.identity()}")
-        print(f"Public: {secret.pubkey_hex()}")
-        print(f"Secret: {secret.seed_hex()} (32-byte seed)")
-        print()
-        print("Store the secret somewhere safe. Anyone with the seed can sign as this identity.")
-    else:
+    mnemonic_words = getattr(args, "mnemonic_words", None)
+    if args.key_type == "nostr":
+        if mnemonic_words is not None:
+            raise ValueError("--mnemonic-words is only valid with --key-type ed25519")
         secret = NostrSecret.generate()
         print(f"Primary: nostr:{secret.npub()}")
         print(f"Public: {secret.npub()}")
@@ -60,12 +61,71 @@ def cmd_identity_new(args: argparse.Namespace) -> int:
         print("Store the secret somewhere safe. Anyone with the nsec can sign as this identity.")
         return 0
+
+    # ed25519: default 24 words.
+    words = mnemonic_words if mnemonic_words is not None else 24
+    if words not in (12, 24):
+        raise ValueError(f"mnemonic word count must be 12 or 24, got {words}")
+    secret, phrase = generate_ed25519_with_mnemonic(words)
+    print(f"Primary: {secret.identity()}")
+    print(f"Public: {secret.pubkey_hex()}")
+    print(f"Secret: {secret.seed_hex()} (32-byte seed)")
+    print(f'Mnemonic ({words} words): "{phrase}"')
+    print()
+    if words == 24:
+        print(
+            "The 24-word phrase and the hex seed are equivalent backups —\n"
+            "either restores this identity. Store at least one safely."
+        )
+    else:
+        print(
+            "The 12-word phrase is the canonical backup. The hex seed is\n"
+            "derived from it (one-way) — you can't reconstruct the phrase\n"
+            "from the seed. Store the phrase safely."
+        )
+    return 0
+
+
+def cmd_identity_mnemonic(args: argparse.Namespace) -> int:
+    words = args.words if args.words is not None else 24
+    if words not in (12, 24):
+        raise ValueError(f"mnemonic word count must be 12 or 24, got {words}")
+    print(generate_mnemonic(words))
+    return 0
+
+
+def cmd_identity_from_mnemonic(args: argparse.Namespace) -> int:
+    phrase = args.phrase
+    if not phrase or not phrase.strip():
+        raise ValueError("identity from-mnemonic needs the phrase in quotes")
+    secret = ed25519_from_mnemonic(phrase)
+    word_count = len(phrase.split())
+    print(f"Primary: {secret.identity()}")
+    print(f"Public: {secret.pubkey_hex()}")
+    print(f"Secret: {secret.seed_hex()} (32-byte seed)")
+    print(f'Mnemonic ({word_count} words): "{phrase.strip()}"')
+    if word_count == 24:
+        # Confirm canonical round-trip; flag if not.
+        from .mnemonic import mnemonic_from_seed_24
+
+        derived = mnemonic_from_seed_24(bytes.fromhex(secret.seed_hex()))
+        if derived.strip() != phrase.strip():
+            print(f'(note: canonical form is "{derived}")')
+    return 0


 # ── claim ─────────────────────────────────────────────────────────────────────


+def _signer(args: argparse.Namespace):
+    return signer_from_flags(
+        args.nsec,
+        args.ed25519_seed,
+        getattr(args, "mnemonic", None),
+    )
+
+
 def _build_claim(subject: str, args: argparse.Namespace):
-    signer = signer_from_flags(args.nsec, args.ed25519_seed)
+    signer = _signer(args)
     primary = signer.identity()
     payload = new_claim_payload(Identity.parse(subject), primary)
     return sign_claim(payload, signer)
@@ -132,12 +192,12 @@ def cmd_verify_id(args: argparse.Namespace) -> int:
 def _resolve_primary_readonly(args: argparse.Namespace) -> Identity:
     if getattr(args, "primary", None):
         return Identity.parse(args.primary)
-    signer = signer_from_flags(args.nsec, args.ed25519_seed)
+    signer = _signer(args)
     return signer.identity()


 def cmd_sigchain_add(args: argparse.Namespace) -> int:
-    signer = signer_from_flags(args.nsec, args.ed25519_seed)
+    signer = _signer(args)
     primary = signer.identity()
     chain = sigchain.load_chain(primary)
     payload = new_add_payload(
@@ -159,7 +219,7 @@ def cmd_sigchain_add(args: argparse.Namespace) -> int:


 def cmd_sigchain_revoke(args: argparse.Namespace) -> int:
-    signer = signer_from_flags(args.nsec, args.ed25519_seed)
+    signer = _signer(args)
     primary = signer.identity()
     chain = sigchain.load_chain(primary)
     payload = new_revoke_payload(
@@ -217,6 +277,7 @@ def cmd_sigchain_export(args: argparse.Namespace) -> int:
 def _add_key_flags(p: argparse.ArgumentParser) -> None:
     p.add_argument("--nsec")
     p.add_argument("--ed25519-seed", dest="ed25519_seed")
+    p.add_argument("--mnemonic")


 def build_parser() -> argparse.ArgumentParser:
@@ -228,8 +289,27 @@ def build_parser() -> argparse.ArgumentParser:
     identity_sub = p_identity.add_subparsers(dest="identity_command", required=True)
     p_new = identity_sub.add_parser("new", help="generate a new identity")
     p_new.add_argument("--key-type", dest="key_type", choices=["nostr", "ed25519"], default="nostr")
+    p_new.add_argument(
+        "--mnemonic-words",
+        dest="mnemonic_words",
+        type=int,
+        default=None,
+        help="(ed25519 only) generate from a 12- or 24-word BIP-39 phrase",
+    )
     p_new.set_defaults(func=cmd_identity_new)
+
+    p_mn = identity_sub.add_parser(
+        "mnemonic", help="print a fresh BIP-39 phrase without deriving a key"
+    )
+    p_mn.add_argument("--words", type=int, default=None)
+    p_mn.set_defaults(func=cmd_identity_mnemonic)
+
+    p_fm = identity_sub.add_parser(
+        "from-mnemonic", help="derive an Ed25519 identity from a BIP-39 phrase"
+    )
+    p_fm.add_argument("phrase")
+    p_fm.set_defaults(func=cmd_identity_from_mnemonic)

     # claim
     p_claim = sub.add_parser("claim", help="create claims")
     claim_sub = p_claim.add_subparsers(dest="claim_command", required=True)
@@ -87,6 +87,19 @@ class Ed25519Secret:
             raise ValueError("invalid ed25519 seed: expected 32-byte (64 hex char) seed")
         return cls(seed)

+    @classmethod
+    def from_mnemonic(cls, phrase: str) -> "Ed25519Secret":
+        # Lazy import: mnemonic.py imports Ed25519Secret at module top.
+        from .mnemonic import seed_from_mnemonic
+
+        return cls(seed_from_mnemonic(phrase))
+
+    @classmethod
+    def generate_with_mnemonic(cls, words: int = 24) -> tuple["Ed25519Secret", str]:
+        from .mnemonic import generate_ed25519_with_mnemonic
+
+        return generate_ed25519_with_mnemonic(words)
+
     def seed_hex(self) -> str:
         return self._seed.hex()

@@ -132,11 +145,18 @@ def verify_signature(payload, alg: str, key: Identity, sig_hex: str) -> bool:
     return False


-def signer_from_flags(nsec: str | None, ed25519_seed: str | None):
-    if nsec and ed25519_seed:
-        raise ValueError("pass only one of --nsec or --ed25519-seed")
+def signer_from_flags(
+    nsec: str | None,
+    ed25519_seed: str | None,
+    mnemonic: str | None = None,
+):
+    provided = [v for v in (nsec, ed25519_seed, mnemonic) if v]
+    if len(provided) > 1:
+        raise ValueError("--nsec, --ed25519-seed, and --mnemonic are mutually exclusive")
     if nsec:
         return NostrSecret.from_nsec(nsec)
     if ed25519_seed:
         return Ed25519Secret.from_seed_hex(ed25519_seed)
-    raise ValueError("missing key: pass --nsec or --ed25519-seed")
+    if mnemonic:
+        return Ed25519Secret.from_mnemonic(mnemonic)
+    raise ValueError("missing key: pass --nsec, --ed25519-seed, or --mnemonic")
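
`signer_from_flags` enforces mutual exclusion by hand after parsing. A possible alternative, shown here only as a sketch and not as what this commit does, is to let `argparse` reject conflicting flags at parse time with a mutually exclusive group. The parser name and `required=True` are assumptions; commands that accept `--primary` without a key would need the group to stay optional.

```python
import argparse


def build_key_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: exactly one key source must be given."""
    parser = argparse.ArgumentParser(prog="kez-sketch")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--nsec")
    group.add_argument("--ed25519-seed", dest="ed25519_seed")
    group.add_argument("--mnemonic")
    return parser


args = build_key_parser().parse_args(["--mnemonic", "legal winner thank"])
print(args.mnemonic)  # prints "legal winner thank"
```

The hand-written check keeps the error message identical across the three CLIs, which a parser-generated message would not; that may be why the explicit form was preferred.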
98
python/kez/mnemonic.py
Normal file
@@ -0,0 +1,98 @@
"""BIP-39 mnemonic phrases for Ed25519 primary keys.

Mirrors ``rust/crates/kez-core/src/mnemonic.rs`` and
``nodejs/packages/kez-core/src/mnemonic.ts`` byte-for-byte.

Two word counts are supported, with different semantics:

- **24 words** ↔ **32 bytes of entropy** ↔ **Ed25519 seed** (bijection).
  Round-trips perfectly. The entropy *is* the seed.

- **12 words** → **16 bytes of entropy** → **Ed25519 seed**, via
  ``SHA-256("kez-bip39-12-v1" || entropy)``. One-way KEZ-specific
  derivation; you cannot recover a 12-word phrase from a seed.

Wordlist: BIP-39 English. NB: we deliberately do *not* use BIP-39's
``to_seed(passphrase)`` function — that produces a 64-byte seed via
PBKDF2, intended to feed into BIP-32 hierarchical derivation. KEZ has
one identity per phrase, so taking the entropy directly (or hashing it
once for 12-word phrases) is the right primitive.
"""

from __future__ import annotations

import hashlib

from mnemonic import Mnemonic as _Bip39

from .keys import Ed25519Secret

# Domain separator for the 12-word → seed derivation. Bumping this would
# break every existing 12-word KEZ identity, so don't.
DOMAIN_TAG_12: bytes = b"kez-bip39-12-v1"

# Module-level singleton of the English BIP-39 wordlist parser, built once
# at import time.
_M = _Bip39("english")


def _assert_words(n: int) -> None:
    if n not in (12, 24):
        raise ValueError(f"mnemonic word count must be 12 or 24, got {n}")


def generate_mnemonic(words: int) -> str:
    """Generate a fresh BIP-39 mnemonic of the requested length.

    The returned phrase is a space-separated lowercase string from the
    BIP-39 English wordlist. ``words`` must be 12 or 24.
    """
    _assert_words(words)
    # bip39 strength is in bits: 12 words = 128 bits, 24 = 256.
    strength = 256 if words == 24 else 128
    return _M.generate(strength=strength)


def seed_from_mnemonic(phrase: str) -> bytes:
    """Decode a phrase (12 or 24 words) to a 32-byte Ed25519 seed.

    For 24 words the entropy IS the seed; for 12 words the seed is
    ``SHA-256(DOMAIN_TAG_12 || entropy)``.
    """
    trimmed = " ".join(phrase.split())
    try:
        entropy = bytes(_M.to_entropy(trimmed))
    except Exception as exc:  # noqa: BLE001 — wrap as our own error
        raise ValueError(f"invalid mnemonic: {exc}") from exc

    if len(entropy) == 32:
        return entropy
    if len(entropy) == 16:
        return hashlib.sha256(DOMAIN_TAG_12 + entropy).digest()
    raise ValueError(
        f"mnemonic must decode to 16 or 32 bytes of entropy, got {len(entropy)}"
    )


def mnemonic_from_seed_24(seed: bytes) -> str:
    """Inverse of :func:`seed_from_mnemonic` for the 24-word case ONLY.

    There is no inverse for 12-word phrases (hashing is one-way) — this
    function always produces 24 words.
    """
    if len(seed) != 32:
        raise ValueError(
            f"mnemonic_from_seed_24: seed must be 32 bytes, got {len(seed)}"
        )
    return _M.to_mnemonic(seed)


def ed25519_from_mnemonic(phrase: str) -> Ed25519Secret:
    """Reconstruct an :class:`Ed25519Secret` from a BIP-39 phrase."""
    return Ed25519Secret(seed_from_mnemonic(phrase))


def generate_ed25519_with_mnemonic(words: int) -> tuple[Ed25519Secret, str]:
    """Generate a fresh Ed25519 identity *and* return its BIP-39 phrase."""
    phrase = generate_mnemonic(words)
    secret = ed25519_from_mnemonic(phrase)
    return secret, phrase
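
The whitespace handling in `seed_from_mnemonic` comes down to one idiom, `" ".join(phrase.split())`. The standalone sketch below isolates it; `normalize_phrase` and `checked_word_count` are names invented for this example.

```python
def normalize_phrase(phrase: str) -> str:
    """Collapse any run of whitespace to one space and trim the ends."""
    # str.split() with no argument splits on runs of spaces, tabs and newlines.
    return " ".join(phrase.split())


def checked_word_count(phrase: str) -> int:
    """Return the word count, rejecting anything other than 12 or 24."""
    count = len(phrase.split())
    if count not in (12, 24):
        raise ValueError(f"mnemonic word count must be 12 or 24, got {count}")
    return count


print(normalize_phrase("  legal \t winner\nthank  "))  # prints "legal winner thank"
```

Normalising before decoding is what lets a phrase pasted with stray tabs or line breaks decode to the same seed as its single-spaced form.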
@@ -5,6 +5,7 @@ description = "KEZ portable identity graph — Python implementation"
 requires-python = ">=3.10"
 dependencies = [
     "cryptography>=42",
+    "mnemonic>=0.20",
     "zstandard>=0.22",
 ]

@@ -1,2 +1,3 @@
 cryptography>=42
+mnemonic>=0.20
 zstandard>=0.22
158
python/tests/test_mnemonic.py
Normal file
@@ -0,0 +1,158 @@
"""Tests for the BIP-39 mnemonic ↔ Ed25519 seed derivation.

The three vectors below are ground truth — Rust, Node, and Python MUST
all derive these exact seeds and pubkeys. See
``python/MNEMONIC-TEST-VECTORS.md``.
"""

from __future__ import annotations

import pytest

from kez.keys import Ed25519Secret
from kez.mnemonic import (
    DOMAIN_TAG_12,
    ed25519_from_mnemonic,
    generate_ed25519_with_mnemonic,
    generate_mnemonic,
    mnemonic_from_seed_24,
    seed_from_mnemonic,
)

# ── canonical interop vectors ────────────────────────────────────────────────

V1_PHRASE = (
    "abandon abandon abandon abandon abandon abandon abandon abandon "
    "abandon abandon abandon abandon abandon abandon abandon abandon "
    "abandon abandon abandon abandon abandon abandon abandon art"
)
V1_SEED_HEX = "0000000000000000000000000000000000000000000000000000000000000000"
V1_PUBKEY_HEX = "3b6a27bcceb6a42d62a3a8d02a6f0d73653215771de243a63ac048a18b59da29"

V2_PHRASE = (
    "abandon abandon abandon abandon abandon abandon "
    "abandon abandon abandon abandon abandon about"
)
V2_SEED_HEX = "09451c0f06588db78205e32a793536e15ae263c8f9ee6d14f5c6fd82b8bd20da"
V2_PUBKEY_HEX = "9403c32e0d3b4ce51105c0bcac09a0d73be0cca98a6bf7b3cd434651be866d70"

V3_PHRASE = (
    "legal winner thank year wave sausage worth useful "
    "legal winner thank yellow"
)
V3_SEED_HEX = "9df434a2bd5dc767ee949d8ab95ca09c4ebbb88cefc3d0b1523f6b2a744ca824"
V3_PUBKEY_HEX = "cc99d06b15ccb83a5ca43f25dd3d27f50638c1c6fbe3a822352da3e07156ce03"

VECTORS = [
    pytest.param(V1_PHRASE, V1_SEED_HEX, V1_PUBKEY_HEX, id="v1-24word-zero"),
    pytest.param(V2_PHRASE, V2_SEED_HEX, V2_PUBKEY_HEX, id="v2-12word-zero"),
    pytest.param(V3_PHRASE, V3_SEED_HEX, V3_PUBKEY_HEX, id="v3-12word-legal"),
]


@pytest.mark.parametrize("phrase, seed_hex, pubkey_hex", VECTORS)
def test_vector_seed_matches(phrase: str, seed_hex: str, pubkey_hex: str) -> None:
    assert seed_from_mnemonic(phrase).hex() == seed_hex


@pytest.mark.parametrize("phrase, seed_hex, pubkey_hex", VECTORS)
def test_vector_pubkey_matches(phrase: str, seed_hex: str, pubkey_hex: str) -> None:
    secret = ed25519_from_mnemonic(phrase)
    assert secret.pubkey_hex() == pubkey_hex
    assert secret.seed_hex() == seed_hex


# ── structural properties ───────────────────────────────────────────────────


def test_domain_tag_bytes() -> None:
    # 15 ASCII bytes — must match the Rust/Node constant exactly.
    assert DOMAIN_TAG_12 == b"kez-bip39-12-v1"
    assert len(DOMAIN_TAG_12) == 15


def test_generate_24_round_trips() -> None:
    phrase = generate_mnemonic(24)
    assert len(phrase.split()) == 24
    seed = seed_from_mnemonic(phrase)
    phrase2 = mnemonic_from_seed_24(seed)
    assert phrase == phrase2


def test_generate_12_is_deterministic() -> None:
    phrase = generate_mnemonic(12)
    assert len(phrase.split()) == 12
    assert seed_from_mnemonic(phrase) == seed_from_mnemonic(phrase)


def test_mnemonic_from_seed_24_is_inverse() -> None:
    seed = bytes([42]) * 32
    phrase = mnemonic_from_seed_24(seed)
    assert seed_from_mnemonic(phrase) == seed


def test_mnemonic_from_seed_24_rejects_wrong_length() -> None:
    with pytest.raises(ValueError):
        mnemonic_from_seed_24(b"\x00" * 16)


def test_invalid_word_count() -> None:
    with pytest.raises(ValueError):
        generate_mnemonic(18)
    with pytest.raises(ValueError):
        generate_mnemonic(0)


def test_invalid_words_errors_cleanly() -> None:
    with pytest.raises(ValueError):
        seed_from_mnemonic("not actually words at all here")


def test_invalid_checksum_errors() -> None:
    # 12 valid words but wrong checksum.
    bad = "abandon " * 11 + "abandon"
    with pytest.raises(ValueError):
        seed_from_mnemonic(bad.strip())


def test_whitespace_tolerance() -> None:
    padded = f" {V2_PHRASE} "
    assert seed_from_mnemonic(padded) == seed_from_mnemonic(V2_PHRASE)
    # Collapses internal whitespace too.
    weird = V2_PHRASE.replace(" ", " \t ")
    assert seed_from_mnemonic(weird) == seed_from_mnemonic(V2_PHRASE)


def test_twelve_and_24_overlapping_entropy_differ() -> None:
    # Sanity: a 12-word phrase must not derive the same seed as a 24-word
    # phrase built from the same repeated entropy byte. The 12-word path
    # hashes its 16 bytes with the domain tag, so the two seeds differ.
    from mnemonic import Mnemonic

    m = Mnemonic("english")
    p12 = m.to_mnemonic(bytes([7]) * 16)
    p24 = m.to_mnemonic(bytes([7]) * 32)
    assert seed_from_mnemonic(p12) != seed_from_mnemonic(p24)


# ── Ed25519Secret hooks ─────────────────────────────────────────────────────


def test_ed25519_from_mnemonic_matches_direct_seed() -> None:
    phrase = mnemonic_from_seed_24(bytes([1]) * 32)
    from_mn = Ed25519Secret.from_mnemonic(phrase)
    from_hex = Ed25519Secret.from_seed_hex("01" * 32)
    assert from_mn.pubkey_hex() == from_hex.pubkey_hex()


def test_generate_with_mnemonic_pair_is_consistent() -> None:
    secret, phrase = Ed25519Secret.generate_with_mnemonic(24)
    restored = Ed25519Secret.from_mnemonic(phrase)
    assert secret.pubkey_hex() == restored.pubkey_hex()


def test_generate_with_mnemonic_12() -> None:
    secret, phrase = generate_ed25519_with_mnemonic(12)
    assert len(phrase.split()) == 12
    restored = ed25519_from_mnemonic(phrase)
    assert secret.pubkey_hex() == restored.pubkey_hex()