THIS FIELD NOTICE IS PROVIDED ON AN "AS IS" BASIS AND DOES NOT IMPLY ANY KIND OF GUARANTEE OR WARRANTY, INCLUDING THE WARRANTY OF MERCHANTABILITY. YOUR USE OF THE INFORMATION ON THE FIELD NOTICE OR MATERIALS LINKED FROM THE FIELD NOTICE IS AT YOUR OWN RISK. CISCO RESERVES THE RIGHT TO CHANGE OR UPDATE THIS FIELD NOTICE AT ANY TIME.
Affected Product Name | Description | Comments |
---|---|---|
APIC-SERVER-L3 | APIC Appliance - Large Config. (> 1200 Edge Ports) | |
APIC-SERVER-L3= | APIC Appliance - Large Configuration (1200 Edge Ports) SPARE | Part Alternate |
APIC-SERVER-M3 | APIC Appliance - Medium Configuration (Upto 1200 Edge Ports) | |
APIC-SERVER-M3= | APIC Appliance - Medium (Upto 1200 Edge Ports) SPARE | Part Alternate |
N9K-C93108TC-FX | Nexus 9300 with 48p 10G-T, 6p 100G QSFP28 | |
N9K-C93108TC-FX-24 | Nexus 9300-FX w/ 24p 100M/1/10GT & 6p 40/100G | |
N9K-C93108TC-FX24= | Nexus 9300-FX 24x100M/1/10GT & 6x100G,Spare(no Acc/PSU/fan) | |
N9K-C93108TC-FX= | Nexus 9300 with 48p 10G-T, 6p 100G QSFP28 | Part Alternate |
N9K-C93180YC-FX | Nexus 9300 with 48p 1/10/25G, 6p 40/100G, MACsec | Fix On Fail PID |
N9K-C93180YC-FX-24 | Nexus 9300-FX w/24p 1/10/25G & 6p 40/100G | |
N9K-C93180YC-FX24= | Nexus 9300-FX w/24p 1/10/25G & 6p 100G;Spare(no Acc/PSU/fan) | |
N9K-C93180YC-FX3 | Nexus 9300 48p 1/10/25G, 6p 40/100G, MACsec,SyncE | |
N9K-C93180YC-FX3= | Nexus 9300 48p 1/10/25G, 6p 40/100G, MACsec,SyncE | Part Alternate |
N9K-C93180YC-FX3S | Nexus 9300 with 48p 1/10/25G SFP, 6p 40/100G QSFP, SyncE | |
N9K-C93180YC-FX3S= | Nexus 9300 with 48p 1/10/25G SFP, 6p 40/100G QSFP, SyncE | Part Alternate |
N9K-C93180YC-FX= | Nexus 9K w/ 48p 1/10/25G, 6p 40/100G Spare (No Acc/PSU/fan) | Fix On Fail PID |
N9K-C93216TC-FX2 | Nexus 9300 with 96p 10G-T, 12p 100G QSFP, MACsec capable | Fix On Fail PID |
N9K-C93216TC-FX2= | Nexus 9300 with 96p 10G-T, 12p 100G QSFP, MACsec capable | Fix On Fail PID |
N9K-C93240YC-FX2 | Nexus 9300 with 48p 10/25G SFP+ and 12p 100G QSFP28 | |
N9K-C93240YC-FX2= | Nexus 9K fixed,48p 10/25G SFP+12p 100G spare(no PSU/fan/acc) | Part Alternate |
N9K-C93360YC-FX2 | Nexus 9300 w/ 96p 1/10/25G, 12p 100G, MACsec capable | Fix On Fail PID |
N9K-C93360YC-FX2= | Nexus 9300 w/ 96p 1/10/25G,12p 100G,MACsec capable, Spare | Fix On Fail PID |
N9K-C9336C-FX2 | Nexus 9300 Series, 36p 40/100G QSFP28 | |
N9K-C9336C-FX2= | Nexus 9K fixed, 36p 40/100G QSFP28,spare(no fan/psu/acc) | Part Alternate |
N9K-C9348GC-FXP | Nexus 9300 with 48p 100M/1GT, 4p 10/25G & 2p 40/100G QSFP28 | |
N9K-C9348GC-FXP= | Nexus 9K,48x1GT,4x10/25G,2x40/100G Spare(No Acc kit,PS&fan) | Part Alternate |
N9K-C9364C-GX | Nexus 9K ACI & NX-OS Leaf/Spine, 64p 40/100G QSFP28 | |
N9K-C9364C-GX= | Nexus 9K ACI & NX-OS Leaf/Spine, 64p 40/100G QSFP28 | Part Alternate |
Defect ID | Headline |
---|---|
CSCwb98743 | FN72464: Some DIMMs failing at higher than expected rate |
A limited number of Dual In-line Memory Modules (DIMMs) shipped from Cisco are impacted by a known deviation in the memory supplier's manufacturing process. This deviation can result in a higher rate of failure.
In Revision 1.2 of this field notice, the Application Policy Infrastructure Controller (APIC) products were moved from Fix on Fail to Proactive DIMM replacement.
You must replace the DIMMs and upgrade from the older Cisco Integrated Management Controller (CIMC) BIOS (Version 4.1(3c) or earlier) in the same maintenance window.
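As an illustration of the "Version 4.1(3c) or earlier" check, the following is a minimal sketch. It assumes that CIMC version strings follow the `major.minor(patch+letter)` pattern seen in this notice; the function names `parse_cimc_version` and `needs_bios_update` are hypothetical and not part of any Cisco tool.

```python
import re

# Assumption: versions look like "4.1(3c)", i.e. major.minor(patch + optional letters).
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)([a-z]*)\)$")

# Threshold taken from this field notice: 4.1(3c) or earlier must be updated.
THRESHOLD = (4, 1, 3, "c")


def parse_cimc_version(version):
    """Turn a string such as '4.1(3c)' into a comparable tuple (4, 1, 3, 'c')."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"Unrecognized version format: {version!r}")
    major, minor, patch, letter = match.groups()
    return (int(major), int(minor), int(patch), letter)


def needs_bios_update(version):
    """Return True when the version is at or below the 4.1(3c) threshold."""
    return parse_cimc_version(version) <= THRESHOLD
```

For example, `needs_bios_update("4.1(2f)")` evaluates to `True`, while `needs_bios_update("4.2(1a)")` evaluates to `False`. Confirm the version actually installed against the CIMC interface before you rely on a check like this.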
DIMM manufacturers build each DIMM from multiple memory chips to reach the desired capacity. In this case, a manufacturing deviation in specific chips impacts 16GB DIMMs. The deviation was confined to a specific date range: the DIMMs that use these chips were manufactured from the middle to the end of 2020.
Since the discovery of this deviation, additional limits have been imposed on the manufacturing process to help prevent future DIMMs from experiencing this process variation.
Most DIMMs with this manufacturing deviation will exhibit persistent correctable memory errors. If left untreated, the DIMMs can eventually encounter an uncorrectable memory event. If encountered during runtime, uncorrectable errors will cause an unexpected switch reset.
Various DIMM Reliability, Availability, and Serviceability (RAS) features, or even operating system features, can mask the extent of these correctable errors. Cisco recommends that you check your DIMMs for exposure using the Serial Number Validation Tool described in the Serial Number Validation section of this field notice. Only specific DIMMs are impacted by this issue.
Customers should replace the hardware DIMM to avoid the potential for unexpected switch/server failure. For information about requesting a replacement, see the Upgrade Program Information section of this field notice after validating the Serial Number(s) as described in the How To Identify Affected Products section.
All Serial Numbers are the Switch or Server Serial Number, not the DIMM Serial Number.
DIMM Replacement
Cisco is offering field services free of charge for DIMM replacement through a qualified Cisco third-party Field Engineer.
After the replacement DIMMs have arrived onsite and you are ready to schedule the replacement(s), send an email to ciscodimmswap@centricsit.com to engage the field services team.
Impacted DIMMs can be identified based on the Product ID (PID) and serial number. Use the Serial Number Validation Tool described in the Serial Number Validation section of this field notice to identify impacted products.
Note: Cisco recommends notifying our engineers for onsite service to schedule repairs or replace the device. See the Additional Information section.
Switch Logs That Contain Memory Errors
When running NX-OS Standalone, a unit that experiences this issue can show these messages in the syslogs on the device:
%DAEMON-3-SYSTEM_MSG: Location: SOCKET:0 CHANNEL:? DIMM:? [] - mcelog
%DEVICE_TEST-3-MCE_24HR_FAIL: Module 1 has exceeded MCE 24 hour correctable threshold of 100 with #### correctable errors within 24 hours.
%DAEMON-3-SYSTEM_MSG: corrected Socket memory error count exceeded threshold: #### in 24h - mcelog
or
%DAEMON-3-SYSTEM_MSG: MESSAGE : corrected DIMM memory error count exceeded threshold: #### in 24h - mcelog
%DAEMON-3-SYSTEM_MSG: MESSAGE_Location: /var/log/mcelog - mcelog
Note: A syslog message that references the error count is generated each time 100 errors are corrected, and it prints the cumulative count of errors.
These errors indicate that correctable errors are being generated, which should not impact switch performance. If errors continue, an uncorrectable error can be experienced, and the device will undergo a kernel panic.
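To check collected syslogs for the messages listed above, a short script can help. The following is a minimal sketch, not a Cisco-provided tool. The patterns are modeled on the message formats quoted in this notice, and they assume that the `####` placeholder appears as a decimal integer in real logs. The function name `find_mce_events` is hypothetical.

```python
import re

# Patterns modeled on the syslog formats quoted in this field notice.
# Assumption: the "####" placeholder is a decimal integer in real logs.
MCE_PATTERNS = [
    re.compile(
        r"%DEVICE_TEST-3-MCE_24HR_FAIL: Module \d+ has exceeded MCE 24 hour "
        r"correctable threshold of \d+ with (?P<count>\d+) correctable errors"
    ),
    re.compile(
        r"%DAEMON-3-SYSTEM_MSG: (?:MESSAGE : )?corrected (?:Socket|DIMM) memory error "
        r"count exceeded threshold: (?P<count>\d+) in 24h"
    ),
]


def find_mce_events(lines):
    """Return (line, count) pairs for lines that match a known MCE threshold message."""
    events = []
    for line in lines:
        for pattern in MCE_PATTERNS:
            match = pattern.search(line)
            if match:
                events.append((line.strip(), int(match.group("count"))))
                break  # one match per line is enough
    return events
```

You could pass the function the lines of a saved log file, for example `find_mce_events(open("switch.log"))`, where `switch.log` is a placeholder file name. Any match suggests that the device should be checked with the Serial Number Validation Tool.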
Switch Special Notes
This field notice covers onsite memory replacement. If your impacted switch/server has already failed, or has memory errors and is degraded, use the standard RMA replacement instead.
Onsite replacements are typically scheduled in advance and performed during maintenance windows.
Some switches are originally assembled with 24GB of memory, which consists of one 16GB DIMM and one 8GB DIMM. For those devices, Cisco sends both DIMMs. When the switch cover is removed, both DIMMs are replaced even though the 8GB DIMM is known to be extremely reliable. This is a proactive decision made by the Cisco team.
"Cisco highly recommends that you take advantage of the field service offering of a field engineer. Please note that if you decide to change the DIMMs on your own, some switch models require a more extensive process to gain access to the DIMM location."
Cisco highly recommends that a Cisco Field Engineer complete the replacement on switches with high screw counts.
This table lists each switch model, its DIMM access method, and the number of screws involved.
Switch Model (PID) | DIMM Access Method | Number of Screws |
---|---|---|
N9K-C93180YC-FX3S | DIMM Door Access | 6 |
N9K-C93240YC-FX2 | DIMM Door Access | 6 |
N9K-C9364C-GX | DIMM Door Access | 6 |
N9K-C93180YC-FX-24* | DIMM Door Access | 6 |
N9K-C93180YC-FX3 | DIMM Door Access | 6 |
N9K-C93108TC-FX3P | DIMM Door Access | 5 |
N9K-C9336C-FX2 | Top Cover Access | 37 |
N9K-C9348GC-FXP | Top Cover Access | 35 |
N9K-C93108TC-FX | Top Cover Access | 33 |
N9K-C93108TC-FX-24* | Top Cover Access | 33 |
Additionally, switches designed with 8GB low-density DIMM memory are impacted, but the failure rate is extremely low. These products are fixed upon failure.
8GB and Supervisor Product IDs |
---|
N9K SUP-A+ |
N9K SUP-A+= |
N9K-C93360YC-FX2 |
N9K-C93360YC-FX2= |
N9K-C93216TC-FX2 |
N9K-C93216TC-FX2= |
N9K-C93180YC-FX |
N9K-C93180YC-FX= |
APIC Server Special Notes
The CIMC BIOS issue is noted in UCS field notice FN72272. Because of this BIOS issue, the reported EC error counts can be higher than the actual EC error count. Uncorrectable errors can also appear because of the older BIOS.
Cisco provides a tool to verify whether a device is impacted by this issue. To check the device, enter the serial number in the Serial Number Validation Tool.
Important: For security reasons, you must click the Serial Number Validation Tool link that is provided in this section. Do not copy and paste the link into a browser. Use of the Serial Number Validation Tool URL external to this field notice will fail.
Support Case Manager must be used for ordering replacement parts for this Field Notice. To open Support Case Manager in a new tab, click the following link:
https://mycase.cloudapps.cisco.com/fieldnotice?fn=FN72464
Provide the following information:
Order entry supports up to 50 serial numbers per request. For more than 50, submit additional requests.
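If you have a long list of affected serial numbers, you can split it into request-sized groups before you submit it. The following is a minimal sketch. The function name `batch_serials` is hypothetical, the only value taken from this notice is the limit of 50 per request, and the de-duplication step is an added convenience rather than a stated requirement.

```python
def batch_serials(serials, batch_size=50):
    """Split serial numbers into lists of at most batch_size entries.

    Blank entries are dropped and duplicates are removed while the
    original order is preserved (an illustrative convenience).
    """
    cleaned = (s.strip() for s in serials)
    unique = list(dict.fromkeys(s for s in cleaned if s))
    return [unique[i:i + batch_size] for i in range(0, len(unique), batch_size)]
```

With 120 unique serial numbers, the function returns three groups (50, 50, and 20 entries), which corresponds to three separate requests.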
Version | Description | Section | Date |
---|---|---|---|
1.5 | Updated the Upgrade Program Information to use Support Case Manager (SCM). | Upgrade Program Information | 2023-OCT-13 |
1.4 | Updated the Upgrade Program Section. | — | 2023-AUG-24 |
1.3 | Updated the How To Identify Affected Products Section. | — | 2023-JUN-23 |
1.2 | Updated the Products Affected, Problem Description, Workaround/Solution, and Additional Information Sections. | — | 2022-DEC-13 |
1.1 | Updated the Background, Workaround/Solution, and Additional Information Sections. | — | 2022-NOV-14 |
1.0 | Initial Release | — | 2022-OCT-13 |
For further assistance or for more information about this field notice, contact the Cisco Technical Assistance Center (TAC) using one of the following methods:
To receive email updates about Field Notices (reliability and safety issues), Security Advisories (network security issues), and end-of-life announcements for specific Cisco products, set up a profile in My Notifications.