Last 12 weeks · 265 commits
6 of 6 standards met
Issue Description: I tested the main branch of OpenR on a topology in which one router has 12 neighboring OpenR routers, and the log shows Spark errors: both SparkHelloMsg and HeartBeatMsg failed to be sent. With a small number of neighbors (for example, 4), there are no such errors. Some of the logs
Summary: This PR adds a comprehensive troubleshooting guide for OSS operators, in a docs section that currently has no troubleshooting reference.
Problem: The OpenR Operator Guide covers configuration, CLI usage, and route representation, but has no troubleshooting page. New operators and contributors hit well-known runtime issues (Spark send failures, build errors, jemalloc warnings, Python CLI crashes) and have to search through old GitHub issues to find answers. Many of these problems are documented in closed and open issues (#72, #100, #134, #135, #136, #137, #147, #163) but are not surfaced in any structured way in the docs.
What This PR Adds: A new file covering 7 common issues, each with symptoms, root cause analysis, diagnostic commands, and fixes or workarounds. It also includes Section 8, "Collecting Diagnostic Information", with a ready-to-paste bash snippet for bug reports, which alone should improve the quality of new issue filings.
Why This Matters: The OpenR issue tracker has many reports of the same errors hit independently by different operators. A canonical troubleshooting reference reduces duplicate issues, helps operators self-serve before filing bugs, gives contributors a starting point for understanding known failure modes, and makes the OSS project more approachable for new adopters.
Checklist:
[x] Docs only, no code changes
[x] New file in existing section
[x] All issues cross-linked with GitHub issue numbers
[x] Each section has: Symptom, Cause, Diagnostics, Fix/Workaround
[x] Diagnostic collection section included for bug reporters
[x] Written for Ubuntu 20.04/22.04 OSS deployment context
Bug Report
OpenR version: (latest OSS build)
OS: Ubuntu 22.04 LTS (x86_64)
Python: 3.10.12
Summary: On an OpenR instance configured with area-aware link metrics, the Python CLI crashes. The Thrift struct for links in non-default areas omits the top-level field, but the display code accesses that field unconditionally, without checking for its presence. This is a silent regression: it surfaces only in multi-area deployments, which are increasingly common in data-center spine/leaf topologies.
Steps to Reproduce:
1. Configure OpenR with multiple areas.
2. Bring up OpenR with at least one link in each area.
3. Run the affected CLI command.
Result: The CLI crashes.
Expected result: A table showing all links with their per-area metrics (or a placeholder where the metric is unset).
Root Cause: In area-aware mode, links in non-default areas use a value from the area policy instead of populating the top-level field in the Thrift struct. The Python CLI display code reads the top-level field unconditionally. The fix is straightforward: check that the field is present before accessing it.
Proposed Fix: Guard the access as described above. Alternatively, the display code should be updated to read the area policy value when the top-level field is absent, showing the effective metric for each area.
Impact: This bug silently breaks the primary CLI diagnostic tool for any operator running OpenR in a multi-area configuration. Since multi-area is the recommended deployment model for large-scale fabrics, this is a high-impact issue for production operators who rely on the CLI for day-to-day troubleshooting.
Environment: OpenR built from the main branch; Python 3.10.12; multi-area config; Ubuntu 22.04.
Related: #72 (Python module issues), #134 (other CLI runtime errors)
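The guarded access described in the root cause can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual CLI code: the `Link` dataclass, its `metric` and `area_metric` fields, the `format_metric` helper, and the `-` placeholder are all hypothetical stand-ins, since the real struct and field names are not given in the report.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Link:
    """Hypothetical stand-in for the Thrift link struct (names assumed)."""
    name: str
    metric: Optional[int] = None       # top-level metric; may be unset
    area_metric: Optional[int] = None  # assumed per-area metric from policy


def format_metric(link: Link) -> str:
    """Return a printable metric without assuming any field is set."""
    # getattr with a default also tolerates objects lacking the attribute.
    metric = getattr(link, "metric", None)
    if metric is None:
        # Fall back to the per-area value when the top-level one is absent.
        metric = getattr(link, "area_metric", None)
    # "-" is an arbitrary placeholder for "no metric available".
    return str(metric) if metric is not None else "-"


links = [Link("eth0", metric=10), Link("eth1", area_metric=20), Link("eth2")]
rows = [(link.name, format_metric(link)) for link in links]
for name, metric in rows:
    print(f"{name:<8}{metric}")
```

The point of the sketch is that every read goes through a presence check, so a link with no top-level value produces a table row instead of an exception.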
Summary: The Open/R OSS Docker build copies the repo into the image and runs the build script. However, the getdeps invocation in that script was missing a flag, so getdeps cloned the canonical repository (from the manifest) at a pinned revision and built that instead of the copied source. The runtime banner gave it away: local changes baked into the image were silently ignored. Add the flag to the command so that getdeps builds the Open/R sources in the checkout. This matches what another line in the same script already does and what the GitHub Actions workflow does. Dependencies are still fetched and pinned by getdeps as before; only the top-level Open/R project now comes from the local tree. Differential Revision: D108239680
Summary: The openr GitHub CI fetches mvfst, fizz, and fb303 at HEAD because their hash files are not synced by CodeSync (D107286362 fixes this long-term). When folly's pinned version was updated to a commit that removed something the unpinned mvfst HEAD still referenced, the build broke. This adds a workflow step that writes hash files for the three missing deps, using hashes from the same monorepo snapshot as the current folly/fbthrift/wangle pins. The step is a no-op if the files already exist (i.e., once CodeSync starts syncing them via D107286362). Differential Revision: D107302581
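The "write the pin only if it is missing" behavior of that step could look roughly like the sketch below. It is an illustration only: the `write_missing_pins` helper, the `<dep>-rev.txt` file naming, the `Subproject commit <hash>` file content, and the placeholder revisions are assumptions, and the real step is a CI workflow step rather than this Python function.

```python
import tempfile
from pathlib import Path

# Placeholder revisions (not real commits); real hashes would come from the
# same snapshot as the existing folly/fbthrift/wangle pins.
PINS = {
    "mvfst": "0" * 40,
    "fizz": "1" * 40,
    "fb303": "2" * 40,
}


def write_missing_pins(hash_dir: Path, pins: dict) -> list:
    """Write one pin file per dependency, skipping files that already exist."""
    hash_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for dep, rev in pins.items():
        pin_file = hash_dir / f"{dep}-rev.txt"  # assumed file naming
        if pin_file.exists():
            continue  # already synced; leave the existing pin untouched
        pin_file.write_text(f"Subproject commit {rev}\n")  # assumed format
        written.append(dep)
    return written


with tempfile.TemporaryDirectory() as tmp:
    hash_dir = Path(tmp) / "hashes"
    first = write_missing_pins(hash_dir, PINS)   # creates all three files
    second = write_missing_pins(hash_dir, PINS)  # no-op: files now exist
    print(first, second)
```

Skipping existing files is what makes the step safe to keep in place after the long-term fix lands: once the synced files appear, the step stops writing anything.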
Summary: Snappy >= 1.1.9 disables RTTI by default in its CMakeLists.txt (google/snappy#184). This causes a failure when linking openr binaries, because folly's compression module (compiled with RTTI) references snappy's typeinfo, which the RTTI-stripped library doesn't export. The previous approach of passing an override flag does not work, because snappy's CMakeLists.txt explicitly strips any existing occurrence before appending the RTTI-disabling flag. Fix: add a workflow step after fetching snappy that patches its CMakeLists.txt to remove that flag, restoring the GCC default (RTTI enabled). This is the same approach used by Fedora, Debian, and openSUSE to work around google/snappy#184. Reviewed By: mloo3 Differential Revision: D107163112
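The patch step amounts to deleting one compiler flag from a text file, which can be sketched as below. This is a hedged illustration, not the workflow step itself: it assumes the flag in question is GCC's `-fno-rtti`, that the relevant CMakeLists.txt lines resemble the `SAMPLE` text, and `strip_no_rtti` is a hypothetical helper name.

```python
import re

# Assumed shape of the relevant snappy CMakeLists.txt lines.
SAMPLE = '''\
string(REGEX REPLACE "-frtti" "" CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fno-rtti")
add_library(snappy "")
'''


def strip_no_rtti(cmake_text: str) -> str:
    """Remove each -fno-rtti token (and the whitespace before it)."""
    return re.sub(r"[ \t]*-fno-rtti\b", "", cmake_text)


patched = strip_no_rtti(SAMPLE)
print(patched)
```

With the flag gone, nothing in the file disables RTTI any more, so the compiler falls back to its default of RTTI enabled.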
Repository: facebook/openr. Description: Distributed platform for building autonomic network functions. Stars: 935, Forks: 251. Primary language: C++. Languages: C++ (73.9%), Python (19.5%), Thrift (3%), CMake (2.9%), Shell (0.4%). License: MIT. Latest release: rc-20191208-10906 (6y ago). Open PRs: 11, open issues: 11. Last activity: 53m ago. Community health: 87%. Top contributors: saifhhasan, xiangxu1121, jstrizich, wez, ahornby, mloo3, r-barnes, yi-xian, cooperlees, simpkins and others.