facebook/fboss

Facebook Open Switching System Software for controlling network switches.

by facebook

C++

976 stars · 399 forks · 713 contributors · Active 3h ago · Since 2015

Meet the team

phshaikh: 1.1k contributions
nivinlawrence: 1.0k contributions
shri-khare: 978 contributions
daiwei1983: 745 contributions
srikrishnagopu: 660 contributions
somasun: 627 contributions
ravi861: 409 contributions
Scott8440: 383 contributions

Languages

C++: 95.8%
Python: 2.4%
Starlark: 0.6%
Thrift: 0.5%
CMake: 0.4%
MDX: 0.2%
Other: 0.1%

Commit activity

Last 12 weeks · 2101 commits


Community health

5 of 6 standards met

Health score: 87%
✓ README · ✓ License · ✓ Contributing · ✓ Code of Conduct · ○ Issue Template · ✓ PR Template

Recent PRs & issues

Active · Last activity 3h ago
cel-gl
[Celestica] Ladakh800bcls: Agent Test: Fix AgentTrunkLoadBalancerTests no used vlan create route interface fail in warmboot · Open PR

Pre-submission checklist
[x] I've run the linters locally and fixed lint errors related to the files I modified in this PR. You can install the linters by running [ ]

clang-format: Passed
shellcheck: Skipped (no files to check)
shfmt: Skipped (no files to check)
trim trailing whitespace: Passed
fix end of files: Passed
check yaml: Skipped (no files to check)
check json: Skipped (no files to check)
check for merge conflicts: Passed
ruff check: Skipped (no files to check)
ruff format: Skipped (no files to check)
Prevent sai_impl in fboss manifest: Passed

Summary

Issue Summary
On multi-NPU hardware platforms, the AgentTrunkLoadBalancerTest suite hit a fatal warmboot failure, which manifested as a "no sai vlan for VlanID 2001" error. The root cause was an inconsistency between the agent's software SwitchState and the actual hardware state programmed in the SAI layer after a cold boot, which then became fatal upon warmboot.

Root Cause Analysis
1. Initial Config Creates Redundant VLANs: The test's setup phase begins by calling initialConfig(), which uses utility::onePortPerInterfaceConfig. This utility creates a clean L3 topology by assigning a unique VLAN and Router Interface to every physical port on the switch (e.g., Port 1 → VLAN 2000, Port 2 → VLAN 2001, etc.).
2. Aggregate Port Creation Orphans VLANs: Immediately after, the configureAggregatePorts helper is called, which bundles a subset of these physical ports into Link Aggregation Groups (LAGs). For example, it might place Ports 1, 2, and 3 into agg1. As part of this, it correctly re-assigns all member ports to a single VLAN (e.g., VLAN 2000).
3. State Mismatch: This function only updated the port-to-VLAN mappings. It did not remove the now-empty VLANs and Interfaces (e.g., VLAN 2001, VLAN 2002, and their corresponding L3 Interfaces) from the main configuration lists (vlans and interfaces).
4. Warmboot Failure: During the initial cold boot, applyNewConfig would see these orphaned but still present VLAN/Interface objects in the configuration and program them into the hardware. However, because these VLANs had no member ports on npu0, some SAI implementations would garbage-collect or simply not retain the empty VLAN in hardware. Upon warm boot, the agent would reload its software state (which still contained the "ghost" InterfaceID 2001) and attempt to re-initialize it. When SaiRouterInterfaceManager tried to create the router interface, it would query the SAI layer for the handle of its underlying VLAN (VlanID 2001). Since the hardware for this empty VLAN no longer existed on npu0, the call would fail, leading to the fatal crash.

Solution Implemented
The solution was to make the configuration declaratively correct by explicitly pruning any orphaned VLANs and interfaces before applying the config. This was achieved by modifying the configureAggregatePorts function in fboss/agent/test/agent_hw_tests/AgentTrunkLoadBalancerTests.cpp:
1. Identify Active VLANs: After the aggregate ports are created, the code now iterates through all vlanPorts in the configuration to build a set of activeVlans, that is, VLANs that still have at least one physical or aggregate port member.
2. Prune Unused VLANs and Interfaces: The code uses std::remove_if to filter both the vlans and interfaces lists within the cfg::SwitchConfig object. Any VLAN or interface whose ID is not in the activeVlans set is removed.
This ensures that the config object passed to applyNewConfig is a precise representation of the desired final state, without any orphaned entries. As a result, the SwitchState delta calculation is now correct, and the agent properly removes the unused VLANs from the hardware during the initial setup, preventing the warmboot failure.

Test Plan
Tested on npu0 and npu1, cold and warm boot.

=== Cold boot: AgentTrunkLoadBalancerTest.ECMPFullTrunkHalfHash4X3WideTrunksCpuTraffic (switch_id=0) ===
Waiting for config files...
Config files ready after 1s
Starting hw_agent processes...
=== Warm boot: AgentTrunkLoadBalancerTest.ECMPFullTrunkHalfHash4X3WideTrunksCpuTraffic (switch_id=0) ===
=== Test Results ===
Cold boot: PASSED (exit code 0)
Warm boot: PASSED (exit code 0)

=== Cold boot: AgentTrunkLoadBalancerTest.ECMPFullTrunkHalfHash4X3WideTrunksCpuTraffic (switch_id=1) ===
Waiting for config files...
Config files ready after 1s
Starting hw_agent processes...
=== Warm boot: AgentTrunkLoadBalancerTest.ECMPFullTrunkHalfHash4X3WideTrunksCpuTraffic (switch_id=1) ===
=== Test Results ===
Cold boot: PASSED (exit code 0)
Warm boot: PASSED (exit code 0)

AI skill code check
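
The pruning step described under "Solution Implemented" can be illustrated with a minimal sketch. The struct definitions and the function name pruneOrphanedVlans below are hypothetical stand-ins, not the actual cfg::SwitchConfig Thrift types or the code in the PR. The sketch also assumes that an interface is matched to the active set through its VLAN ID.

```cpp
#include <algorithm>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-ins for the real config types.
struct Vlan {
  int id;
};
struct Interface {
  int intfID;
  int vlanID;
};
struct VlanPort {
  int logicalPort;
  int vlanID;
};
struct SwitchConfig {
  std::vector<Vlan> vlans;
  std::vector<Interface> interfaces;
  std::vector<VlanPort> vlanPorts;
};

// Remove every VLAN and interface that no vlanPort entry references.
void pruneOrphanedVlans(SwitchConfig& cfg) {
  // Step 1: collect the VLANs that still have at least one member port.
  std::unordered_set<int> activeVlans;
  for (const auto& vlanPort : cfg.vlanPorts) {
    activeVlans.insert(vlanPort.vlanID);
  }

  // Step 2: erase-remove idiom on both lists.
  cfg.vlans.erase(
      std::remove_if(
          cfg.vlans.begin(),
          cfg.vlans.end(),
          [&](const Vlan& vlan) { return activeVlans.count(vlan.id) == 0; }),
      cfg.vlans.end());
  cfg.interfaces.erase(
      std::remove_if(
          cfg.interfaces.begin(),
          cfg.interfaces.end(),
          [&](const Interface& intf) {
            return activeVlans.count(intf.vlanID) == 0;
          }),
      cfg.interfaces.end());
}
```

With ports 1, 2, and 3 all re-assigned to VLAN 2000, a call to pruneOrphanedVlans would leave only VLAN 2000 and its interface in the config, so VLAN 2001 and VLAN 2002 never reach applyNewConfig.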

cel-gl · 1h ago
hillol-nexthop
[Nexthop][fboss2-dev] CLI for config and delete ACL rule and action subcommands · Open PR

Summary
Full ACL rule CLI surface: match fields, actions, and whole-rule delete.
Upsert: if the named rule does not exist in the table, it is created with the supplied attribute; otherwise the attribute is applied to the existing entry.
Delete: removes the entire rule from the table.

Match-field attributes
16 setters. The 5-token form is unique to the setter.

0xFF, permit, deny, AclEntry.actionType, MatchAction, dataPlaneTrafficPolicy.matchToAction, MatchToAction, sw.aclTableGroups, aclTableGroup, processAclTableGroupDelta, SaiAclTableManager, SaiSwitch

fboss2_cmd_config_test --gtest_filter='AclRule'
CmdConfigAclRuleTestFixture: argValidation_badArity, argValidation_unknownAttr, argValidation_extraTokenOnlyForTtl, argValidation_outOfRange, argValidation_badIp, argValidation_badMac, argValidation_protocolKeyword, argValidation_ipFragmentEnum, argValidation_etherTypeNumericOrName, argValidation_unknownActionSubattr, argValidation_actionPermitNoExtra, argValidation_actionRequiresValue, argValidation_actionRangeChecks, argValidation_actionRedirectShape, ruleAutoCreated, tableNotFound, setSourceIp, setDestinationIp, setProtocol, setSourcePort, setDestinationPort, setDscp, setTcpFlags, setIcmpType, setIcmpCode, setIpFragment, setTtlDefaultMask, setTtlExplicitMask, setDestinationMac, setEtherType, setVlan, setIpType, setPacketLookupResult, setActionPermit, setActionDeny, setActionSendToQueue, setActionSetDscp, setActionSetTc, setActionMirrorIngress, setActionMirrorEgress, setActionCounter, setActionTrapToCpu, setActionCopyToCpu, setActionRedirectNexthop
CmdDeleteAclRuleTestFixture: argValidation_badArity, argValidation_validArity, tableNotFound, ruleNotFound, deleteRule

fboss2_integration_test --gtest_filter='AclRule'
SetUpTestSuite, /etc/coop/agent.conf, qualifiers, fboss_sw_agent, TearDownTestSuite, config acl rule, TearDown, delete acl rule
SetSourceIp, SetDestinationIp, SetProtocol, SetSourcePort, SetDestinationPort, SetDscp, SetTcpFlags, SetIcmpType, SetIcmpCode, SetIpFragment, SetTtlDefaultMask, SetTtlExplicitMask, SetDestinationMac, SetEtherType, SetVlan, SetIpType, SetPacketLookupResult, SetActionPermit, SetActionDeny, SetActionSendToQueue, SetActionSetDscp, SetActionSetTc, SetActionTrapToCpu, SetActionCopyToCpu, DeleteRule
config acl rule, delete acl rule

action mirror-ingress, action mirror-egress, action counter, and action redirect nexthop are intentionally unit-test-only: SAI rejects entries that reference an undefined mirror session, unprovisioned counter, or unresolvable nexthop, and provisioning that supporting state on the DUT is out of scope for this PR. The CLI is verified to construct the right config delta in unit tests.
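
The upsert and whole-rule delete semantics described in the summary can be sketched as follows. This is only an illustration under stated assumptions: AclRule, AclTable, setRuleAttr, and deleteRule are hypothetical names, attributes are modelled as plain strings, and none of this reflects the real fboss2 CLI classes or the agent config schema.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified model of an ACL rule: a name plus attribute map.
struct AclRule {
  std::string name;
  std::map<std::string, std::string> attrs;
};

// Hypothetical ACL table keyed by rule name.
using AclTable = std::map<std::string, AclRule>;

// Upsert: create the rule if it is missing, then apply the attribute.
// Returns true when the rule was newly created by this call.
bool setRuleAttr(
    AclTable& table,
    const std::string& ruleName,
    const std::string& attr,
    const std::string& value) {
  auto [it, created] = table.try_emplace(ruleName, AclRule{ruleName, {}});
  it->second.attrs[attr] = value;
  return created;
}

// Delete: remove the whole rule. Returns false if the rule did not exist.
bool deleteRule(AclTable& table, const std::string& ruleName) {
  return table.erase(ruleName) > 0;
}
```

Setting a first attribute on an unknown rule creates it, a second attribute on the same rule updates the existing entry, and deleteRule drops the rule together with all of its attributes.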

hillol-nexthop · 1h ago
vybhav-nexthop
[Nexthop][fboss2-dev] Add config admin-distance CLI command · Open PR

Pre-submission checklist
[X] I've run the linters locally and fixed lint errors related to the files I modified in this PR. You can install the linters by running [X]

Summary
Adds a new write command: fboss2-dev config switch admin-distance
Sets the administrative distance for a routing client in the agent running config via a COLDBOOT config session.

What
New command CmdConfigAdminDistance with AdminDistanceArg validating two positional args: client-id (non-negative integer) and distance (0-255).
Read-modify-write on the clientIdToAdminDistance map: all other client entries survive untouched.
No-op detection: if the value is already set, returns early without saving.
Registered under as a switch-level setting.
Blocks client-ids 1/2/3/4 (STATIC_ROUTE, INTERFACE_ROUTE, LINKLOCAL_ROUTE, REMOTE_INTERFACE_ROUTE), whose admin distances are hardcoded in the agent and cannot be changed via config.
Wired into CMake (fboss2_config_lib, fboss2_cmd_config_test, fboss2_integration_test) and BUCK.

Test Plan
Unit tests: //fboss/cli/fboss2/test/config:cmd_config_test
[----------] 4 tests from CmdConfigAdminDistanceTestFixture
[ RUN ] CmdConfigAdminDistanceTestFixture.argValidation
[ OK ] CmdConfigAdminDistanceTestFixture.argValidation (85 ms)
[ RUN ] CmdConfigAdminDistanceTestFixture.updateExistingClient
[ OK ] CmdConfigAdminDistanceTestFixture.updateExistingClient (95 ms)
[ RUN ] CmdConfigAdminDistanceTestFixture.addNewClient
[ OK ] CmdConfigAdminDistanceTestFixture.addNewClient (76 ms)
[ RUN ] CmdConfigAdminDistanceTestFixture.alreadySet
[ OK ] CmdConfigAdminDistanceTestFixture.alreadySet (60 ms)
[----------] 4 tests from CmdConfigAdminDistanceTestFixture (318 ms total)
[ PASSED ] 260 tests.
//fboss/cli/fboss2/test/config:cmd_config_test PASSED in 16.4s

Integration test: ConfigAdminDistanceTest.SetAndRestoreAdminDistance on NH-4010-F
Note: Google Test filter = ConfigAdminDistance
[==========] Running 1 test from 1 test suite.
[----------] 1 test from ConfigAdminDistanceTest
[ RUN ] ConfigAdminDistanceTest.SetAndRestoreAdminDistance
I0630 18:31:30] Fboss2IntegrationTest::SetUp - starting CLI test
I0630 18:31:30] Agent already ready; skipping wait
I0630 18:31:30] [Step 1] Reading current admin distances...
I0630 18:31:30] client-id 0 -> 20
I0630 18:31:30] [Step 2] Setting client-id 0 to 42...
I0630 18:31:30] Running CLI command: config switch admin-distance 0 42
I0630 18:31:30] Running CLI command: config session commit
I0630 18:31:30] [Step 3] Restoring client-id 0 to 20...
I0630 18:31:30] Running CLI command: config switch admin-distance 0 20
I0630 18:31:30] Running CLI command: config session commit
I0630 18:31:30] TEST PASSED
[ OK ] ConfigAdminDistanceTest.SetAndRestoreAdminDistance (123 ms)
[----------] 1 test from ConfigAdminDistanceTest (123 ms total)
[==========] 1 test from 1 test suite ran. (123 ms total)
[ PASSED ] 1 test.
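
The validation, read-modify-write, and no-op behaviour listed under "What" could look roughly like the sketch below. The function setAdminDistance, the UpdateResult enum, and the use of a plain std::map are assumptions made for illustration. The actual CmdConfigAdminDistance implementation and its config session handling are not shown in the PR description.

```cpp
#include <map>
#include <set>
#include <stdexcept>

// Hypothetical result type: whether the config needs to be saved.
enum class UpdateResult { Saved, NoOp };

// Client ids 1-4 are described in the PR as hardcoded in the agent.
const std::set<int> kReservedClientIds = {1, 2, 3, 4};

// Validate the arguments, then update a single entry in place so that
// every other client's distance is left untouched.
UpdateResult setAdminDistance(
    std::map<int, int>& clientIdToAdminDistance,
    int clientId,
    int distance) {
  if (clientId < 0) {
    throw std::invalid_argument("client-id must be a non-negative integer");
  }
  if (distance < 0 || distance > 255) {
    throw std::invalid_argument("distance must be in the range 0-255");
  }
  if (kReservedClientIds.count(clientId) > 0) {
    throw std::invalid_argument("admin distance for this client is fixed");
  }

  // No-op detection: return early when the value is already set.
  auto it = clientIdToAdminDistance.find(clientId);
  if (it != clientIdToAdminDistance.end() && it->second == distance) {
    return UpdateResult::NoOp;
  }

  clientIdToAdminDistance[clientId] = distance;
  return UpdateResult::Saved;
}
```

In this sketch, a caller would persist the config only when the result is UpdateResult::Saved, which mirrors the early return without saving described above.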

vybhav-nexthop · 2h ago

Recent fixes

ezongCLS
[Celestica] Ladakh800bcls: Decouple RTM from MDIO, Add RTM CTRL Config Function, and Add RTM config into platform_manager · Merged PR

Pre-submission checklist
[x] I've run the linters locally and fixed lint errors related to the files I modified in this PR. You can install the linters by running [x]

Summary
Based on the modification of the BSP MDIO driver, remove the unused from the MDIO device initialisation in . Add the functions to support the logic of RTM CTRL creation. Add the RTM CTRL config into . Update the corresponding of in .

Motivation
A recent PR in has modified the creation of the MDIO controller and added an exclusive driver for the retimer controller IP block. This PR modifies the fboss logic for the two devices to match the latest driver.

Test Plan
Tested on Anacapa 25% (Ladakh800b): creating the RTM CTRL devices: devices in : after modification (removed ):

ezongCLS · 8h ago
Structured data for AI agents

Repository: facebook/fboss. Description: Facebook Open Switching System Software for controlling network switches. Stars: 976, Forks: 399. Primary language: C++. Languages: C++ (95.8%), Python (2.4%), Starlark (0.6%), Thrift (0.5%), CMake (0.4%). Open PRs: 100, open issues: 73. Last activity: 3h ago. Community health: 87%. Top contributors: phshaikh, nivinlawrence, shri-khare, daiwei1983, srikrishnagopu, somasun, ravi861, Scott8440, priyankw-meta, shiva-menta and others.
