Machine as insider threat: Lessons from Kyoto University’s backup data deletion
Toshio Okabe, director of the Academic Center for Computing and Media Studies within the Institute for Information Management and Communication at Japan’s Kyoto University, issued an apology on December 28 to users of the university’s supercomputing systems for the loss of approximately 77 terabytes of user data, comprising roughly 34 million files from 14 research groups.
The apology follows the advisory users received on December 16, which outlined how, from December 14 to 16, a defect in the backup program for the supercomputer system supporting the /LARGE0 directory caused data to be deleted unintentionally.
The system’s supplier was identified as Nippon Hewlett-Packard GK. Hewlett Packard Enterprise (HPE), in its own apology, confirmed the data loss, accepted 100% of the responsibility, and noted that 49 terabytes of data remained backed up on the /LARGE1 directory. HPE went on to offer compensation to those who lost files and provided a pathway to initiate that dialogue.
How the data loss occurred
As HPE explained, the backup script had been modified to “improve visibility and readability,” and the updated script was applied by “overwriting” the version that was still running. As a result, the running script was “reloaded from the middle” of its execution, and the deletion routine removed files throughout the /LARGE0 directory instead of only the files designated for deletion.
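To make the failure mode concrete, here is a minimal, hypothetical Python sketch (not the vendor’s actual script) of how a deletion routine whose target variable silently becomes empty mid-run can wipe a parent directory such as /LARGE0 instead of the intended subdirectory:

```python
import shutil
from pathlib import Path

BACKUP_ROOT = Path("/LARGE0")  # mount point referenced in the incident

def purge(subdir: str) -> None:
    """Hypothetical cleanup step: remove one aged-out subdirectory.

    If `subdir` arrives empty -- for example, a variable lost when a
    running script is overwritten and re-read from the middle -- the
    target collapses to the backup root itself and the entire tree goes.
    """
    target = BACKUP_ROOT / subdir   # Path("/LARGE0") / "" == Path("/LARGE0")
    shutil.rmtree(target)           # no guard: deletes everything under /LARGE0

def purge_with_guard(subdir: str) -> None:
    """Same step with a check that would have limited the blast radius."""
    target = (BACKUP_ROOT / subdir).resolve()
    if not subdir or target == BACKUP_ROOT.resolve():
        raise ValueError("refusing to delete the backup root")
    shutil.rmtree(target)
```

The language here is incidental; the failure mode is the combination of a live script replaced in place and a deletion target built from a variable that can silently become empty.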
To mitigate against a recurrence, HPE said it would:
- Fully verify code prior to applying it to the supercomputer system.
- Conduct an examination of the impact and suggest improvements to prevent a recurrence.
- Re-educate engineers to avoid recurrence of human error.
Kyoto University suspended the backup process until the end of January.
Missing backup processes and procedures
The university’s apology explained that restoration of the deleted files wasn’t possible because there was no multi-generational backup; the basic tenets of “grandfather-father-son” backup methodologies simply were not in place. Going forward, however, incremental backups would be the norm, and a full backup mirroring the original corpus would also be created.
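As a rough illustration of the scheme the university says was missing, here is a minimal sketch of grandfather-father-son rotation combined with daily incrementals. The schedule and retention counts are assumptions chosen for illustration, not Kyoto’s or HPE’s actual configuration:

```python
from datetime import date

# Hypothetical retention: keep 12 monthly fulls, 4 weekly fulls, 7 daily incrementals.
RETENTION = {"monthly-full": 12, "weekly-full": 4, "daily-incremental": 7}

def backup_tier(day: date) -> str:
    """Grandfather-father-son rotation:
    monthly full ("grandfather") on the first of the month,
    weekly full ("father") on Sundays,
    daily incremental ("son") otherwise.

    Keeping several independent generations means a single bad backup
    run cannot destroy every restorable copy at once.
    """
    if day.day == 1:
        return "monthly-full"
    if day.weekday() == 6:  # Monday is 0, so 6 is Sunday
        return "weekly-full"
    return "daily-incremental"

print(backup_tier(date(2021, 12, 15)))  # -> "daily-incremental"
```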
In his apology to all users, Okabe said, “The possibility of file loss due to equipment failure or disaster [exists] so … please backup your important files to another system.”
The published notice was updated on January 4, 2022, reducing the reported scope of the data loss. As it turned out, some of the lost files did not require restoration, bringing the final loss to only 8 terabytes of important data comprising 3.5 million files. The number of affected users, both within the university and beyond, who used the university’s supercomputers was put at 68.
The takeaway for CISOs
Kyoto University outsourced the operation of the supercomputer center to a third party, in this instance HPE. Okabe pointedly placed the problem at the feet of HPE, while also noting that there “was a problem with the operation management system at the university.” Even a single-generation, 100-terabyte cold-storage backup drive (an investment of less than $10,000) would have been sufficient to house the prior day’s or week’s files and obviate the catastrophic loss of user data. Such was not the case.
What happened in Kyoto was a classic case of the machine as insider threat. In this case, nothing was stolen; rather, the information was destroyed. There is no indication this incident was anything more than human error. Multiple groups had their research affected, a setback no doubt. Yet although human error was at play, the result would have been identical had the code been purposefully adjusted to fail by an unscrupulous individual intent on destroying the work of a targeted entity within the affected groups.
Questions CISOs should be asking of their teams include:
- How many third-party vendors does our infrastructure rely on to keep our system and data up, secure, and accessible?
- What is the visibility into the third-party’s processes and procedures?
- How are script modifications crafted, introduced, and verified?
- What is the level of visibility into the third-party employees/contractors responsible for supporting our instance?