Sending a Diagnostic Interrupt
This feature is for advanced users. Sending a diagnostic interrupt to a live system can cause data corruption or system failure.
You can send a diagnostic interrupt to troubleshoot an unresponsive or unreachable compute virtual machine (VM) instance.
A diagnostic interrupt causes the instance's OS to crash and reboot. Before you send a diagnostic interrupt, you must configure the OS to generate a crash dump (also called a memory dump file) when it crashes. The crash dump captures information about the state of the OS at the time of the crash. After the OS restarts, you can analyze the crash dump to identify and debug the issue.
Required IAM Policy
To use Oracle Cloud Infrastructure, you must be granted security access in a policy by an administrator. This access is required whether you're using the Console or the REST API with an SDK, CLI, or other tool. If you get a message that you don't have permission or are unauthorized, verify with your administrator what type of access you have and which compartment to work in.
For administrators: The policy in Let users launch compute instances includes the ability to send a diagnostic interrupt to an instance. If the specified group doesn't need to launch instances or attach volumes, you could simplify that policy to include only manage instance-family, and remove the statements involving volume-family and virtual-network-family.
Before You Begin
- The instance's OS must be configured to generate a crash dump file.
- The instance must be in the Running state. For more information, see Stopping, Starting, or Restarting an Instance.
- No actions affecting the instance can be in progress, such as block volumes or secondary VNICs being attached or detached.
Configuring the OS to Generate a Crash Dump
Before you send a diagnostic interrupt to an instance, you must configure the OS to generate a crash dump when it crashes. The diagnostic interrupt is received as a non-maskable interrupt (NMI) on the target instance.
The steps depend on the OS.
Linux
On Oracle Linux platform images, the OS is either fully configured or partially configured to generate a crash dump, depending on the image release date.
- Images released in August 2020 or later: The image is fully configured to generate a crash dump.
- Earlier images: The dump-capture kernel is installed and configured, but you must perform the other configuration steps.
- Images released in September 2020 or later: The image is fully configured to generate a crash dump.
- Earlier images: The dump-capture kernel is installed and configured, but you must perform the other configuration steps.
- Connect to the instance.
- Install and configure the dump-capture kernel:
  - Install kdump and kexec by running the following command:
    sudo yum install kexec-tools
  - Reserve memory on the kernel to save the crash dump. Do the following:
    - Open the /etc/default/grub file in a text editor.
    - In the line that starts with GRUB_CMDLINE_LINUX_DEFAULT, add the parameter crashkernel=<memory-to-reserve>. For example, to reserve 100 MB, add crashkernel=100M.
    - Save the changes and close the file.
    - Rebuild the GRUB file by running the following command:
      sudo grub2-mkconfig -o /boot/grub2/grub.cfg
- Configure the kernel to crash when it receives a diagnostic interrupt. To do this, open the /etc/sysctl.conf file in a text editor and add the following line:
  kernel.unknown_nmi_panic=1
- Apply the change to /etc/sysctl.conf by running the following command:
  sudo sysctl -p
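As a quick sanity check after the Linux steps above, the following shell sketch tests whether a kernel command line reserves crash memory and whether the NMI panic setting is enabled. The function names has_crashkernel and nmi_panic_enabled are hypothetical, introduced only for illustration. The defaults assume the standard Linux locations /proc/cmdline and /proc/sys/kernel/unknown_nmi_panic.

```shell
#!/bin/sh
# Hypothetical helpers for checking the crash dump configuration.

# Succeeds if the given kernel command line contains a crashkernel= parameter.
# With no argument, reads the running kernel's command line from /proc/cmdline.
has_crashkernel() {
    cmdline="${1:-$(cat /proc/cmdline 2>/dev/null)}"
    case " $cmdline " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds if the given value is 1.
# With no argument, reads the live setting from /proc/sys.
nmi_panic_enabled() {
    value="${1:-$(cat /proc/sys/kernel/unknown_nmi_panic 2>/dev/null)}"
    [ "$value" = "1" ]
}
```

After sourcing the file on the instance, a call such as `has_crashkernel && nmi_panic_enabled && echo "configured"` prints a message only when both settings are in effect on the running kernel.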
Windows Server - Platform Image
If you use a Windows Server platform image that was released in April 2020 or later, the image is already configured to generate a crash dump.
If you use an image that was released before April 2020, do the following:
- Connect to the instance.
- Download the Oracle VirtIO Drivers for Microsoft Windows.
- Install the drivers and then restart the instance.
Windows Server - Customer-Provided Image
Refer to the third-party documentation for your operating system for more information.
Sending a Diagnostic Interrupt
After you configure the instance's OS to generate a crash dump when it crashes, use the following procedures to send a diagnostic interrupt.
To send a diagnostic interrupt using the Console
- Open the navigation menu and click Compute. Under Compute, click Instances.
- Click the instance that you're interested in.
- Click More Actions, and then click Send diagnostic interrupt.
  Caution: Sending a diagnostic interrupt to a live system can cause data corruption or system failure.
- Review the confirmation message and then click Send diagnostic interrupt.
  The lifecycle state that appears in the Console remains Running while the instance's OS crashes and restarts. Do not send multiple diagnostic interrupts.
- Wait several minutes for the instance's OS to restart, and then connect to the instance. You can now retrieve and analyze the crash dump.
To send a diagnostic interrupt using the API
Use the InstanceAction operation, passing the value SENDDIAGNOSTICINTERRUPT as the action to perform.
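For command-line use, the same operation can be sketched with the OCI CLI, assuming the CLI is installed and configured for your tenancy. The wrapper function send_diagnostic_interrupt and the DRY_RUN variable are hypothetical names introduced for illustration, and the OCID prefix check is a simple assumption about the identifier format, not an official validation.

```shell
#!/bin/sh
# Hypothetical wrapper around the OCI CLI instance action command.
# Set DRY_RUN=1 to print the command instead of running it.
send_diagnostic_interrupt() {
    instance_id="$1"
    # Assumption: instance OCIDs start with "ocid1.instance."
    case "$instance_id" in
        ocid1.instance.*) ;;
        *) echo "error: expected an instance OCID" >&2; return 2 ;;
    esac
    # Build the command in the function's positional parameters.
    set -- oci compute instance action \
        --instance-id "$instance_id" \
        --action SENDDIAGNOSTICINTERRUPT
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Running it first with DRY_RUN=1 shows the exact command before anything is sent. Call it only once per troubleshooting attempt, because multiple diagnostic interrupts should not be sent.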
Analyzing a Crash Dump
The crash dump is saved locally on the instance's OS.
- Linux instances: The default location where the crash dump is saved depends on the operating system.
  - Oracle Linux 8: Saved in /var/oled/crash.
  - Oracle Linux 7: For platform images released in March 2021 or later, saved in /var/crash. For older platform images, saved in /var/oled/crash.
  - Other Linux and UNIX-like operating systems: Saved in /var/crash/.
  To change the location, modify the /etc/kdump.conf file.
- Windows instances: The crash dump is saved in %SystemRoot%\memory.dmp. On most Windows systems, this is C:\Windows\memory.dmp.
To analyze the crash dump, use a third-party tool such as the crash utility on Linux instances or WinDbg on Windows instances.
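On a Linux instance, locating the most recent dump can be sketched as follows. The function name latest_vmcore is hypothetical. The sketch assumes that kdump names the dump file vmcore and writes it somewhere under the default directories listed above, and that the paths contain no newlines.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified file named "vmcore"
# under the given directories (default: /var/oled/crash and /var/crash).
latest_vmcore() {
    [ "$#" -gt 0 ] || set -- /var/oled/crash /var/crash
    # ls -t sorts newest first; directories that do not exist are ignored.
    find "$@" -type f -name vmcore -exec ls -t {} + 2>/dev/null | head -n 1
}
```

The printed path can then be passed to the crash utility together with a kernel image that contains debug symbols. Where that image lives depends on the debug packages installed on the instance.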