feat(biz): add wx biz-articles command to query public account messages#33

Open
ChenyqThu wants to merge 3 commits into
jackwener:mainfrom
ChenyqThu:feat/biz-articles


@ChenyqThu

feat(biz): add wx biz-articles command to query public account messages

Summary

Adds a new biz-articles subcommand that queries locally cached WeChat public account (公众号) article pushes from biz_message_0.db.

This enables a downstream workflow for downloading full article content:

wx biz-articles --since today --json | jq '.[].url' | xargs opencli weixin download

Background

  • WeChat stores public account (公众号) message pushes in a separate database: message/biz_message_0.db (SQLCipher 4 encrypted)
  • This DB was not exposed by any existing wx-cli command
  • The encryption key is already scanned and stored in ~/.wx-cli/all_keys.json by wx init
  • Each public account has its own Msg_{md5(username)} table, following the same convention as message_0.db
  • Message content is zstd-compressed XML containing <mmreader>/<item> structures with article metadata

New CLI Interface

# Last 50 articles (default)
wx biz-articles

# More articles
wx biz-articles -n 200

# Filter by public account name (fuzzy match on display name)
wx biz-articles --account "返朴"
wx biz-articles --account "Datawhale"

# Time filter (article publish time, YYYY-MM-DD)
wx biz-articles --since 2026-05-10
wx biz-articles --since 2026-05-01 --until 2026-05-10

# Show only accounts with unread messages, one latest article per account
wx biz-articles --unread
wx biz-articles --unread --account "Datawhale"   # combine: unread within specific account

# JSON output (for downstream piping)
wx biz-articles --json
wx biz-articles --since 2026-05-10 --json | jq '.[].url'

Output Fields

Each article item includes:

| Field | Description |
| --- | --- |
| time | Article publish time (formatted) |
| timestamp | Article publish timestamp (seconds) |
| recv_time | Message receive time (when WeChat pushed it) |
| recv_time_str | Message receive time (formatted) |
| account | Public account display name |
| account_username | Public account username (gh_*) |
| title | Article title |
| url | Article URL (mp.weixin.qq.com link) |
| digest | Article summary/excerpt |
| cover_url | Cover image URL |

Implementation Notes

  • biz_message_0.db is loaded on demand via the existing DbCache mechanism (no startup cost unless biz-articles is called)
  • The key for message/biz_message_0.db is already in all_keys.json, so no changes to wx init are needed
  • Multi-article pushes (图文消息) are expanded: each <item> in <mmreader> becomes a separate output row
  • Items without URL or title (e.g., payment notifications from service accounts) are filtered out
  • New extract_cdata helper function strips CDATA wrappers from XML content
  • Results sorted by pub_time DESC (article publish time, not message receive time)
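
The expansion, filtering, and sorting described above could look roughly like the sketch below. It is not the code from src/daemon/query.rs: the Article struct, the tag_text helper, the parse_items name, the tag names title, url, and pub_time, and the fallback to the message receive time when pub_time is absent are all assumptions made for illustration. It also uses plain string scanning rather than a real XML parser.

```rust
/// Hypothetical row type; the fields are a subset of the output fields.
#[derive(Debug, Clone, PartialEq)]
struct Article {
    title: String,
    url: String,
    pub_time: i64,
}

/// Strip a `<![CDATA[...]]>` wrapper if present; otherwise return the trimmed input.
fn extract_cdata(raw: &str) -> &str {
    let s = raw.trim();
    s.strip_prefix("<![CDATA[")
        .and_then(|rest| rest.strip_suffix("]]>"))
        .unwrap_or(s)
}

/// Text between the first `<tag>` and the next `</tag>`, if both exist.
/// (Illustrative helper; assumes tags carry no attributes.)
fn tag_text<'a>(xml: &'a str, tag: &str) -> Option<&'a str> {
    let open = format!("<{tag}>");
    let close = format!("</{tag}>");
    let start = xml.find(&open)? + open.len();
    let end = start + xml[start..].find(&close)?;
    Some(&xml[start..end])
}

/// Expand every `<item>` into one row, drop rows lacking a title or URL,
/// and sort by publish time, newest first.
/// `recv_time` is used when an item has no parsable `pub_time` (assumed fallback).
fn parse_items(xml: &str, recv_time: i64) -> Vec<Article> {
    let mut rows = Vec::new();
    for chunk in xml.split("<item>").skip(1) {
        // Keep only the part before the closing tag of this item.
        let body = chunk.split("</item>").next().unwrap_or(chunk);
        let title = tag_text(body, "title").map(extract_cdata).unwrap_or("");
        let url = tag_text(body, "url").map(extract_cdata).unwrap_or("");
        if title.is_empty() || url.is_empty() {
            continue; // e.g. payment notifications from service accounts
        }
        let pub_time = tag_text(body, "pub_time")
            .map(extract_cdata)
            .and_then(|t| t.parse::<i64>().ok())
            .unwrap_or(recv_time);
        rows.push(Article {
            title: title.to_string(),
            url: url.to_string(),
            pub_time,
        });
    }
    rows.sort_by(|a, b| b.pub_time.cmp(&a.pub_time));
    rows
}
```

Returning borrowed slices from extract_cdata avoids allocating until a row is actually kept, which matters little here but keeps the helper cheap to call on every field.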

--unread semantics

  • Queries session.db for unread_count > 0 rows whose chat_type == official_account, intersects with --account filter if both provided
  • Returns at most one latest article per account (dedupe by account_username after the global pub_time DESC sort)
  • Aligns with the behavior of wx unread --filter official for fast "what unread accounts are there + what's the latest title" scanning
  • Empty intersection short-circuits before scanning biz tables
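
A minimal sketch of the selection logic above, assuming the unread usernames and the --account matches have already been collected into sets. Row, target_accounts, and latest_per_account are hypothetical names, not the ones used in q_biz_articles. In the daemon the empty-set check would happen before any biz table is scanned; here it is shown on already-parsed rows only to keep the example self-contained.

```rust
use std::collections::HashSet;

/// Hypothetical parsed row; only the fields needed for selection.
#[derive(Debug, Clone)]
struct Row {
    account_username: String,
    title: String,
    pub_time: i64,
}

/// Accounts to scan: the unread set, narrowed by the --account matches
/// when that filter is given (`None` means no --account flag).
fn target_accounts(
    unread: &HashSet<String>,
    account_filter: Option<&HashSet<String>>,
) -> HashSet<String> {
    match account_filter {
        Some(filter) => unread.intersection(filter).cloned().collect(),
        None => unread.clone(),
    }
}

/// Keep the newest row per account, then apply the row limit.
fn latest_per_account(mut rows: Vec<Row>, targets: &HashSet<String>, limit: usize) -> Vec<Row> {
    if targets.is_empty() {
        return Vec::new(); // empty intersection: nothing to scan
    }
    rows.retain(|r| targets.contains(&r.account_username));
    // Global sort first, so the first row seen for an account is its newest.
    rows.sort_by(|a, b| b.pub_time.cmp(&a.pub_time));
    let mut seen = HashSet::new();
    rows.retain(|r| seen.insert(r.account_username.clone()));
    rows.truncate(limit);
    rows
}
```

Deduplicating after the global sort means the limit counts accounts rather than raw rows, which matches the "one latest article per account" behavior described above.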

Changes

  • src/ipc.rs: Add BizArticles IPC request variant
  • src/cli/biz_articles.rs: New CLI command handler (follows sns_feed pattern)
  • src/cli/mod.rs: Register BizArticles subcommand in clap + dispatch
  • src/daemon/query.rs: Add q_biz_articles query + parse_biz_xml_items + extract_cdata helpers + 8 unit tests
  • src/daemon/server.rs: Add dispatch case for BizArticles

Test Results

test result: ok. 49 passed; 0 failed; 0 ignored

New tests (8):

  • biz_tests::extract_cdata_normal
  • biz_tests::extract_cdata_empty
  • biz_tests::extract_cdata_url
  • biz_tests::extract_cdata_no_cdata_wrapper
  • biz_tests::parse_biz_xml_items_single_article
  • biz_tests::parse_biz_xml_items_skips_no_url
  • biz_tests::parse_biz_xml_items_multi_article
  • biz_tests::parse_biz_xml_items_pub_time_fallback

Verified Output (real WeChat install with ~30 public accounts, 2026-05-10)

- account: <Account A>
  title: <Sample tech article title>
  url: http://mp.weixin.qq.com/s?__biz=<redacted>&mid=<id>&idx=1&sn=<hash>
  digest: <Article excerpt>
  cover_url: https://mmbiz.qpic.cn/<redacted>
  time: 2026-05-10 17:01
  recv_time_str: 2026-05-10 17:06

- account: <Account B>
  title: <Another article>
  url: http://mp.weixin.qq.com/s?__biz=<redacted>&mid=<id>&idx=1&sn=<hash>

ChenyqThu added 3 commits May 10, 2026 20:49
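Load biz_message_0.db and extract public account pushes (title/url/author/time).

- The daemon decrypts biz_message_0.db on demand via DbCache (the key is already in all_keys.json)
- New IPC variant BizArticles (limit/account/since/until parameters)
- New query handler q_biz_articles:
  - Resolves the gh_* username → md5 → Msg_<hash> table mapping via a Name2Id lookup
  - Filters on local_type & 0xFFFFFFFF = 49 (appmsg, i.e. public account articles)
  - zstd decompression + extract_cdata parsing of the <mmreader>/<item> XML
  - Supports multi-article pushes (one message containing several articles)
  - Output fields: time/timestamp/recv_time/account/account_username/title/url/digest/cover_url
- New CLI subcommand wx biz-articles, with options: -n / --account / --since / --until / --json
- New helper functions extract_cdata (CDATA block parsing) and parse_biz_xml_items
- 8 new unit tests (biz_tests module) covering CDATA parsing and multi-article cases

Supported workflow:
  wx biz-articles --since today --json | jq ".[].url" | xargs opencli weixin download

Verified: the 返朴 ADHD article, the Datawhale Claude Code article, and the 土猛员外 knowledge engine article were all extracted correctly.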
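
List only the latest article for each public account that has unread messages, consistent with 'wx unread --filter official',
which makes it easy to scan "which public accounts still have unread items, and what are the titles".

- ipc.rs: add an unread: bool field to BizArticles (serde default = false for backward compatibility)
- cli/mod.rs: --unread flag
- cli/biz_articles.rs: pass unread through
- daemon/server.rs: add the unread parameter to dispatch
- daemon/query.rs: q_biz_articles
  - With --unread, first query session.db for the set of usernames with unread_count>0 and
    chat_type==official_account
  - Intersect with --account (narrows the range further when both are given)
  - Return early on an empty intersection to avoid a pointless full-table scan
  - After parsing, sort by pub_time DESC and keep only the first row per account_username
  - Finally truncate(limit)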