🌐Incident Response Plan (IRP)
This document is inspired by GitLab’s security incident response guide (https://handbook.gitlab.com/handbook/security/security-operations/sirt/sec-incident-response/) and severity matrix (https://handbook.gitlab.com/handbook/security/security-operations/sirt/severity-matrix/), and by NIST’s Computer Security Incident Handling Guide.
Computer Security Incident Response Team (CSIRT)
More information about our CSIRT can be found at 🌐Computer Security Incident Response Team (CSIRT).
Confidentiality
Incidents may contain sensitive, confidential, or private information. Such details must not be disclosed internally or externally, except when necessary to communicate with affected individuals or relevant authorities.
After the incident has been resolved, the information may be shared publicly, such as in a blog or other format, provided it receives approval from the Head of Security.
Plan Review & Update
The incident response plan undergoes an annual review to ensure its effectiveness in addressing threats and organizational needs. The review involves the Security, Compliance, Legal, Engineering, and Infrastructure teams to identify gaps or outdated elements. The plan is also tested through yearly simulations and tabletop exercises to assess its response to various incidents, identify improvement areas, and refine strategies for better incident management.
The three main exercise scenarios are:
Data leakage (credentials, internal/customer data)
Asset compromise (unauthorized access, malware infection)
Exploited vulnerabilities
Roles & Responsibilities
Role | Responsibilities |
---|---|
Computer Security Incident Response Team (CSIRT) | Manages and coordinates the organization's response to incidents and ensures effective communication throughout the incident lifecycle. |
Incident Commander/Coordinator | Central point of contact for incident-related communication and coordination. Responsibilities include assessing incident severity, activating the CSIRT, and overseeing the overall incident response process. |
Legal Representative | Offers legal expertise and guidance during incidents, especially those involving data breaches, compliance, or legal implications. |
Data Protection Officer (DPO) | Ensures incident responses comply with data protection laws, assesses the impact on privacy, guides coordination across technical and legal teams, and facilitates proper communication with authorities. |
PR Representative | Develops key messages, coordinates media interactions, and manages the organization's public image. Provides guidance and support to the spokesperson. |
Spokesperson | The individual responsible for representing the organization and communicating the matter externally. This person may vary depending on the type of crisis. |
Department Managers and Heads | Communicate incident updates and instructions to their respective teams. They ensure their teams are aware of the incident, follow procedures, and report relevant information to the Incident Coordinator. |
Executive Team/Sponsor | Provides strategic direction and decision-making authority during major incidents. Responsible for designating the spokesperson. They communicate with key stakeholders and protect the organization's reputation and interests. |
Incident Definition
An incident is an unexpected and disruptive event, occurrence, or situation that deviates from normal operations or expectations and requires a response to address and resolve it effectively. Incidents can vary widely in nature and can impact individuals, organizations, systems, or the environment.
Types of incidents communicated by Rocket.Chat:
Events: refers to irregular and unexpected occurrences within a system, network, or environment that have the potential to affect performance for long periods of time. Triggered if the Status page becomes unavailable.
Security Incidents: refers to specific events that indicate or involve a security breach, unauthorized access, or disruption of the confidentiality, integrity, or availability of information or resources. Security incidents can include attempted or successful cyberattacks, malware infections, system intrusions, unauthorized access to sensitive data, and other malicious activities.
Privacy Incidents: pertain to the unauthorized use or disclosure of regulated data, like personally identifiable information or protected health information. If the data involved in a security incident is regulated, the security incident is “up-leveled” to a privacy incident. Privacy incidents typically focus on incidents that affect individual privacy rights and may or may not result in a data breach.
Data breaches: incidents where sensitive, private, or confidential data is accessed, viewed, stolen, or otherwise compromised by unauthorized individuals.
Incident Reporting
Incident investigations are initiated whenever a security event is detected. Once triggered, the Security team carefully evaluates and triages the investigation. If there is sufficient evidence to suggest the event is a false positive, the investigation is dismissed. However, if the Security team determines that the event is a legitimate security incident, the formal incident response process is activated to address and manage the situation.
Incidents can be reported internally, by any Rocketeer, or externally, by sending an email to security@rocket.chat. The Security team can also proactively detect an incident by performing threat hunting.
Process Overview
Detection
The CSIRT, or an internal or external party, identifies a security or privacy event that may be considered an incident.
Automated tools detect an incident and send an alert to #security-alerts.
Analysis
The CSIRT determines whether the reported issue is an incident or a false positive.
If the reported issue is considered an incident, severity and priority levels are defined.
Containment
The CSIRT will engage in preventing the spread of the malicious activity.
The CSIRT will mitigate and minimize the threat until it can fully remediate it.
Systems affected by the security incident may be temporarily disabled or segregated.
Eradication
Attackers' access to Rocket.Chat’s environments is removed.
Components involved in the incident are eliminated.
Controls are strengthened or implemented to ensure the root cause is fully remediated.
Recovery
The affected systems' operations are fully restored after the problem has been remediated.
Monitoring controls may be created.
Post-Incident analysis and activities
Document lessons learned in the security incident’s post-mortem.
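The phases above can be sketched as a small ordered state model. This is only an illustration of the flow described in this section: the names `Phase` and `next_phase` are hypothetical and do not correspond to any tool used by the CSIRT.

```python
from enum import Enum
from typing import Optional


class Phase(Enum):
    """Phases of the process overview, in the order they are listed."""
    DETECTION = "Detection"
    ANALYSIS = "Analysis"
    CONTAINMENT = "Containment"
    ERADICATION = "Eradication"
    RECOVERY = "Recovery"
    POST_INCIDENT = "Post-Incident analysis and activities"


# Enum members iterate in definition order, which matches the plan.
PHASE_ORDER = list(Phase)


def next_phase(current: Phase, false_positive: bool = False) -> Optional[Phase]:
    """Return the phase that follows `current`, or None when the process ends.

    A report judged to be a false positive during Analysis is dismissed,
    so the process stops there instead of moving on to Containment.
    """
    if current is Phase.ANALYSIS and false_positive:
        return None
    index = PHASE_ORDER.index(current)
    if index + 1 < len(PHASE_ORDER):
        return PHASE_ORDER[index + 1]
    return None
```

For example, `next_phase(Phase.DETECTION)` returns `Phase.ANALYSIS`, while `next_phase(Phase.ANALYSIS, false_positive=True)` returns `None`.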
Determining Severity
Severity reflects the actual or potential impact of an incident and helps determine its priority. For example, a critical-severity incident requires urgent attention, while a low-severity incident has a lower priority.
Once triage is complete, the highest assessed severity should remain unchanged. If a severity level is later found to be inaccurate, the adjustment must be documented in the post-mortem, including the reasoning behind the change and how the revised assessment was reached.
Severity | Description |
---|---|
Critical | Failure to take immediate action is likely to result in a significant breach of data confidentiality, integrity, or availability. Existing controls are insufficient to mitigate the risk effectively. The impact could extend to a large number of customers or pose a severe threat to the company’s reputation, such as widespread service disruption, legal ramifications, or loss of customer trust. |
High | A breach of data confidentiality, integrity, or availability is likely if one or more security controls are bypassed. While existing controls mitigate the immediate risk, the potential for customer impact remains significant. The situation also carries a high risk to the company’s reputation, whether through service disruption, regulatory concerns, or loss of trust. |
Moderate | Impact on production data confidentiality, integrity, or availability is unlikely unless multiple security controls fail or are bypassed. Existing controls effectively mitigate the initial risk. There is no direct customer impact, and any reputational risk is moderate. |
Low | There is no risk to production data confidentiality, integrity, or availability unless multiple security controls fail or are compromised. Existing controls are sufficient to mitigate potential threats. There is no risk of customer impact, and reputational risk is minimal or highly unlikely. |
Severity Considerations
When assessing the severity of a security incident, we take several factors into account to determine its impact. Each time we face an incident, we evaluate the scope and exposure of the risk, the level of confidentiality involved, and more. By breaking down the issue into smaller, more manageable sub-issues, we can more effectively assess its severity. Here are some key considerations:
How many of the following areas of the CIA Triad apply?
Confidentiality
Integrity
Availability
Affected Surface - What is the affected surface?
Rocket.Chat infrastructure
Customer data
Cloud accounts
One particular application
A customer’s instance
Rocketeers' workstations
Rocket.Chat’s security systems (SIEM, MDM, EDR, IAM, etc).
Exploitability - How easy is it to exploit the issue?
Very easily - Attacker only needs to run a simple command or script to trigger the issue.
Requires some effort - Attacker needs specific conditions, such as another user being logged in, to exploit the issue.
Difficult - Attacker must meet specific conditions that are challenging to achieve.
Visibility - Who can see this issue?
Everyone
Someone with specific access
Only me
The number of CIA Triad areas impacted, together with the significance of the Affected Surface, Exploitability, and Visibility, should guide the estimation of severity. The severity doesn't need to be perfectly accurate, nor should security engineers spend excessive time debating its exactness. It should be a sufficient approximation to help us understand the level of urgency required.
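As a rough illustration of how these considerations could be recorded and combined, the sketch below captures them in a small data structure. The scoring and thresholds are invented for this example and are not part of this plan; all names (`Assessment`, `rough_severity`, and so on) are hypothetical. The affected surface is recorded but deliberately not scored, since its significance is left to the engineer's judgment.

```python
from dataclasses import dataclass
from enum import IntEnum


class Exploitability(IntEnum):
    DIFFICULT = 1
    SOME_EFFORT = 2
    VERY_EASY = 3


class Visibility(IntEnum):
    ONLY_ME = 1
    SPECIFIC_ACCESS = 2
    EVERYONE = 3


@dataclass
class Assessment:
    confidentiality: bool
    integrity: bool
    availability: bool
    affected_surfaces: list  # e.g. ["Customer data"]; recorded, not scored
    exploitability: Exploitability
    visibility: Visibility

    def cia_count(self) -> int:
        """Number of CIA Triad areas that apply (0 to 3)."""
        return sum([self.confidentiality, self.integrity, self.availability])


def rough_severity(assessment: Assessment) -> str:
    """Approximate a severity label from a score between 2 and 9.

    The thresholds below are illustrative assumptions only.
    """
    score = (
        assessment.cia_count()
        + int(assessment.exploitability)
        + int(assessment.visibility)
    )
    if score >= 8:
        return "Critical"
    if score >= 6:
        return "High"
    if score >= 4:
        return "Moderate"
    return "Low"
```

The result is meant as a starting point for discussion, in line with the guidance that severity only needs to be a sufficient approximation.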
Incident Tracking
All security incidents must be tracked in the following Jira board: https://rocketchat.atlassian.net/jira/core/projects/CSIRT/board. Once a Jira issue has been created for an incident, all supporting issues - e.g. HackerOne reports - must be linked to it.
Documentation
Once an incident is reported and confirmed, an official document may be created to compile all relevant information. This document, known as a "post-mortem," should be started at the beginning of the incident response and shared according to its confidentiality level.
First, create a new folder here: https://drive.google.com/drive/folders/1kf2FF2_IjeZUsjhDPVHmow8MxiNvX5pb. The folder should follow this naming convention:
YYYY-MM-DD - [Type of incident] - Incident description
Then, a document should be created inside the folder with the same name as the folder. The following template can be used:
Privileged and Confidential - <date> - Postmortem template
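The folder naming convention above can be sketched as a small helper. This is an illustration only: the function name is hypothetical, and it assumes the square brackets around the incident type are kept literally in the folder name.

```python
from datetime import date


def incident_folder_name(day: date, incident_type: str, description: str) -> str:
    """Build a name of the form 'YYYY-MM-DD - [Type of incident] - Incident description'."""
    # date.isoformat() already yields YYYY-MM-DD.
    return f"{day.isoformat()} - [{incident_type}] - {description}"
```

For example, `incident_folder_name(date(2024, 3, 5), "Data leakage", "Leaked API token")` produces `2024-03-05 - [Data leakage] - Leaked API token`.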
Collecting evidence
All digital evidence must be properly collected and stored in a way that prevents tampering. Below are the recommendations for different types and sources of evidence.
File evidence
If the evidence is a file - e.g. a malware binary - follow these steps:
Make a copy of it.
Calculate the hash of the original and of the copy, compare them, and record the value in the post-mortem. Remember to state explicitly which algorithm was used (MD5, SHA1, etc.).
Create a folder called "Evidences" inside the incident folder.
Upload the copy of the file to this folder and make it read-only for everyone but you.
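The copy-and-hash part of these steps can be sketched with Python's standard library. This is a minimal sketch under stated assumptions: SHA-256 is chosen here as the default algorithm, the local "Evidences" directory is only a staging area, and the upload to the incident folder and its sharing permissions still have to be handled separately. The function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256") -> str:
    """Hash a file in chunks so large binaries do not need to fit in memory."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def preserve_evidence(source: Path, incident_folder: Path, algorithm: str = "sha256") -> dict:
    """Copy `source` into an 'Evidences' directory and verify the copy by hash."""
    evidence_dir = incident_folder / "Evidences"
    evidence_dir.mkdir(parents=True, exist_ok=True)
    copy_path = evidence_dir / source.name
    shutil.copy2(source, copy_path)

    original_hash = file_digest(source, algorithm)
    copy_hash = file_digest(copy_path, algorithm)
    if original_hash != copy_hash:
        raise ValueError("The copy does not match the original file")

    # Record the algorithm alongside the hash, as the post-mortem requires.
    return {"algorithm": algorithm, "hash": copy_hash, "copy": str(copy_path)}
```

The returned dictionary holds the values to write into the post-mortem: the algorithm name and the matching hash.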
Single document stored in the SIEM
The logs collected by the Elastic SIEM already prevent file tampering, but we should create a copy and store it with the other evidence as well.
Click the "Expand" arrow in the top left corner of the document you found in the Kibana dashboard.
In the top right corner, click "View Single Document".
Click JSON.
In the top right corner, click "Copy to clipboard".
Paste the content into a text file and save it as JSON.
Follow the same steps as in File evidence.
Note that any document can be found in SIEM using the _index and _id fields that will be present in the JSON saved, so it is possible to confirm the file is original.
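As an illustration of that lookup, the sketch below reads the saved JSON and extracts the `_index` and `_id` fields, then builds the path used by Elasticsearch's get-document API (`GET /<index>/_doc/<_id>`). The function name is hypothetical, and the host and authentication for the SIEM are not shown because they are not specified here.

```python
import json
from urllib.parse import quote


def document_reference(saved_json: str) -> dict:
    """Extract the fields needed to find the original document in the SIEM."""
    doc = json.loads(saved_json)
    missing = [field for field in ("_index", "_id") if field not in doc]
    if missing:
        raise ValueError(f"Saved document is missing fields: {', '.join(missing)}")

    index, doc_id = doc["_index"], doc["_id"]
    # URL-encode the ID because it may contain characters such as '/'.
    get_path = f"/{index}/_doc/{quote(doc_id, safe='')}"
    return {"index": index, "id": doc_id, "get_path": get_path}
```

Comparing the document returned for that path with the saved file is one way to confirm that the stored copy has not been altered.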
Multiple documents in the SIEM
The first step is to create an Investigate Timeline in the Kibana app and add all the relevant documents and events to it. Then follow these steps:
Save the Timeline with all relevant documents and events.
In the Kibana app, go to Security -> Timelines.
Click the three dots on the right side of the timeline's row, then click "Export Selected".
Follow the same steps as in File evidence.
Internal Communication
A Rocket.Chat room named #incident-<incident description> must be created for everyone involved in the investigation, and the incident folder must be shared with all of them. One member of the Security team is assigned as the Incident Commander and is responsible for coordinating all parties to collect the information needed to resolve the incident.
The CSIRT will assess the incident severity and determine the appropriate communication plan.
Regular updates should be provided to the relevant stakeholders, such as department heads, executives, and key employees, through email and internal messaging platforms.
Updates should include the current status of the incident, actions being taken, and any changes in the situation.
Communication should be clear, concise, and provide guidance to employees on how to respond or assist in mitigating the incident.
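The room naming pattern described at the start of this section, #incident-<incident description>, can be sketched as a small helper. The slug rules here (lowercase, with runs of other characters collapsed into hyphens) are an assumption made for this example, not a documented requirement, and the function name is hypothetical.

```python
import re


def incident_room_name(description: str) -> str:
    """Turn a short incident description into a '#incident-...' room name."""
    # Collapse anything that is not a lowercase letter or digit into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    if not slug:
        raise ValueError("description must contain at least one letter or digit")
    return f"#incident-{slug}"
```

For example, `incident_room_name("Leaked API token")` produces `#incident-leaked-api-token`.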
External Communication
Only authorized spokespersons, designated by the executive team, are permitted to communicate with external stakeholders.
External communication will be coordinated by the Public Relations (PR) department or a designated spokesperson.
PR will develop key messages and ensure consistency in communication across different channels.
Communication with customers, partners, and vendors will be timely and transparent, providing updates on the incident, its impact, and any steps being taken to address it.
Regulatory agencies and relevant authorities will be promptly notified as required by applicable laws and regulations.
Message Development
Messages should address the incident's impact, actions taken to mitigate it, and the organization's commitment to resolving the issue.
All messages shall be tailored to the intended audience and channels of communication (e.g., customers, employees, media) and should be reviewed and approved by the executive team or designated authority before dissemination.
Clear and consistent messaging should be maintained throughout the incident to avoid confusion and minimize misinformation.
Media Relations
If the incident escalates, for example through media exposure, the PR department or designated spokesperson will handle media inquiries and act as the primary point of contact for journalists.
A media monitoring system should be in place to track news coverage and social media discussions related to the incident.
Media responses should be coordinated with the CSIRT and adhere to the previously approved message.
Media interviews or press conferences will be arranged and managed by the PR department in consultation with the executive team. Spokespersons should be trained in media relations, interview techniques, and handling difficult questions.
Social Media Management
PR, Marketing, and the CSIRT will monitor social media platforms for mentions of the incident.
Timely responses shall be provided to inquiries and concerns raised on social media, directing users to official sources of information.
Misinformation or rumors will be addressed promptly with accurate information to prevent the spread of false information.
Social media channels will be actively used to provide updates and engage with stakeholders throughout the incident.
Communicating with Authorities
When incidents occur, the affected companies must take responsibility for communication, which may include notifying relevant authorities. As a global company with customers across nearly all continents, it can be challenging to keep track of all the necessary authorities. Below is a list of the key authorities that may be contacted if an incident involving Rocket.Chat takes place.
Europe
Under GDPR (Regulation (EU) 2016/679), if a company processes personal data of individuals in the EU/EEA, they are required to notify the relevant supervisory authority within 72 hours of becoming aware of the breach. Additionally, if the breach poses a high risk to individuals' rights and freedoms, the company must also notify the affected individuals directly. A list of all European authorities can be seen at https://www.edpb.europa.eu/about-edpb/about-edpb/members_en.
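To make the 72-hour window concrete, the sketch below computes the notification deadline from the moment the company became aware of the breach. It is an illustration only: the function names are hypothetical, and timestamps are assumed to be timezone-aware so that the calculation does not depend on the responder's local time.

```python
from datetime import datetime, timedelta, timezone

GDPR_NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at: datetime) -> datetime:
    """Return the latest time (in UTC) to notify the supervisory authority."""
    if aware_at.tzinfo is None:
        raise ValueError("aware_at must be timezone-aware")
    return aware_at.astimezone(timezone.utc) + GDPR_NOTIFICATION_WINDOW


def time_remaining(aware_at: datetime, now: datetime) -> timedelta:
    """Return how much of the window is left; negative if the deadline passed."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    return notification_deadline(aware_at) - now.astimezone(timezone.utc)
```

For a breach identified on 2024-03-05 at 09:30 UTC, the deadline falls on 2024-03-08 at 09:30 UTC.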
Under the new Cyber Resilience Act (CRA), we may also need to notify the European Union Agency for Cybersecurity (ENISA).
Additionally, we may also want to notify the Computer Emergency Response Team for the EU institutions (CERT-EU) or specific countries' CERTs such as CERT-FR, CERT-UK, CERT-BE, and so forth. This step is not always mandatory, but notifying the relevant national or EU-level CSIRT may be beneficial, especially if the breach impacts critical infrastructure or has significant cybersecurity implications.
We will likely notify a CERT when one or more of the following apply:
Critical Infrastructure: If the breach affects critical infrastructure or essential services, such as energy, transportation, or health systems, notifying a CERT is sometimes mandatory.
Widespread or Major Impact: If the breach is significant or involves large-scale data loss, it's a good practice to notify the CERT to help mitigate any further cyber threats or attacks.
Government Contracts: If the organization works with government bodies or has contracts that involve sensitive government data, notifying the appropriate CERT may be required.
Latin America
Similarly to the GDPR, the Brazilian equivalent, the LGPD (Lei Geral de Proteção de Dados, in Portuguese), mandates that we notify the national authority (Autoridade Nacional de Proteção de Dados or ANPD, in Portuguese) within 72 hours.
For Mexico, we may need to contact the INAI (Instituto Nacional de Transparencia, Acceso a la Información y Protección de Datos Personales, in Spanish).
For Argentina, we may need to contact the DNPDP (Dirección Nacional de Protección de Datos Personales, in Spanish).
Other LatAm countries will follow similar guidelines. We can check the specific authority in case of an incident involving data subjects from each country.
As for their CERTs, it’s also not mandatory but it can be beneficial to notify teams such as CERT.br, CERT.ar, CERT-MX, and so forth.
Asia
Japan’s Act on the Protection of Personal Information (個人情報の保護に関する法律, in Japanese), often abbreviated to APPI, mandates that organizations notify affected individuals if their personal data is breached. Japan also has a CERT which is called JPCERT/CC.
India’s Information Technology (Reasonable Security Practices and Procedures and Sensitive Personal Data or Information) Rules under the IT Act require notification of data breaches in specific cases, especially if sensitive personal data is compromised. India has CERT-In (Indian Computer Emergency Response Team), which is the government-designated body for responding to cybersecurity incidents. Under Indian law, companies are required to report cybersecurity incidents, including data breaches, to CERT-In.
Other Asian countries will follow similar guidelines. We can check the specific authority in case of an incident involving data subjects from each country.