


Csync

A hardware-adaptive user-imperceptible backup system
Explore the docs »

View Demo · Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Document
  3. Getting Started
  4. Usage
  5. Contact
  6. License

About The Project

Csync uses eBPF to back up data without the user noticing. It monitors system resources and adapts its behavior to the available hardware without downtime, so the user experience is preserved. Backups are automated, so users do not need to manage them manually.

(back to top)

Document

  • OpenHarmony Debug method can be found here
  • Core Backup Algorithm can be found here
  • Demo Video can be found here
  • Demo PPT can be found here

(back to top)

Branches

  • The default branch is master. It contains the documents required for the preliminary and final rounds, along with the Csync documentation.
  • The subsystem-dev branch contains the Csync command-line tool that runs on OpenHarmony (DAYU 200).
  • The shell-dev branch contains the Csync command-line tool that runs on all platforms.
  • The shell-dev-zstd branch contains the Csync command-line tool with compression support; it runs on all platforms.
  • The shell-dev-timedbackup branch contains the Csync command-line tool with timed backup support; it runs on all platforms.

(back to top)

Getting Started

Prerequisites

The differential backup tool requires five open-source libraries: librsync, spdlog, jsoncpp, libzstd, and libbpf.

librsync supports the differential backup algorithm, spdlog provides the tool's logging, jsoncpp parses the configuration file, libzstd compresses the backup directory, and libbpf is used to develop Csync's eBPF functions.

Usage

Usage instructions for Csync are provided in each branch.

(back to top)

Test

There are three directories in the test folder (Board_test, Diff_Test, and Server_test). The two script directories are described below:

  • Server_test contains test scripts written in bash for testing locally.
  • Board_test contains test scripts written in sh for testing on the board.

There are four shell scripts in each directory, each with the following roles:

  • equal_dir.sh: recursively determines whether two directories are the same.
  • new_gen_data.sh: creates the original test folder, covering folder size, number of files, folder depth, file sizes, soft links, hard links, and so on. It is used to test the tool's reliability on complex directory trees.
  • new_modify_data.sh: modifies the folder so that the tool can then back it up for testing. It can create new files and folders, modify files, delete files and folders, create soft links, create hard links, delete soft links, delete hard links, and so on.
  • run_single_diffbackup.sh: combines the three scripts above to run multiple rounds of backup and recovery tests.
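
The recursive comparison that equal_dir.sh is described as performing can be sketched in Python as follows. This is a minimal illustration built on the standard filecmp module, not the project's script; the function name dirs_equal is hypothetical, and the sketch follows symbolic links rather than checking soft or hard links separately.

```python
import filecmp
import os


def dirs_equal(left: str, right: str) -> bool:
    """Return True if both trees hold the same entries with identical file bytes.

    Hypothetical helper for illustration; not part of Csync.
    """
    # ignore=[] disables filecmp's default skip list (for example .git, CVS).
    dc = filecmp.dircmp(left, right, ignore=[])
    # Any entry present on one side only, or not comparable, means "different".
    if dc.left_only or dc.right_only or dc.common_funny or dc.funny_files:
        return False
    # dircmp itself only compares stat signatures, so re-check file contents.
    _, mismatch, errors = filecmp.cmpfiles(
        left, right, dc.common_files, shallow=False
    )
    if mismatch or errors:
        return False
    # Recurse into every subdirectory that exists on both sides.
    return all(
        dirs_equal(os.path.join(left, name), os.path.join(right, name))
        for name in dc.common_dirs
    )
```

A backup and recovery round could call dirs_equal(original_dir, restored_dir) and treat a False result as a failed round.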

(back to top)

Structure

.
├── doc
│   ├── Complie.md
│   ├── Debug.md
│   ├── DeltaMergeAlg.md
│   ├── usage.md
│   └── 会议纪要.md  # meeting minutes
├── LICENSE
├── media
├── README.md
├── test
│   ├── Board_test   # board test scripts
│   │   ├── equal_dir.sh
│   │   ├── new_gen_data.sh
│   │   ├── new_modify_data.sh
│   │   └── run_single_diffbackup.sh
│   ├── Diff_Test
│   └── Server_test  # local test scripts
│       ├── equal_dir.sh
│       ├── new_gen_data.sh
│       ├── new_modify_data.sh
│       └── run_single_diffbackup.sh
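├── 决赛内容  # final-round materials
│   ├── 操作系统开源创新大赛原创承诺书.docx  # originality commitment letter
│   ├── 演示视频.mp4                           # demo video
│   ├── 项目功能说明书.pdf                      # project functional specification
│   └── Csync硬件自适应的用户无感知备份系统.pptx             # PPT
└── 初赛内容  # preliminary-round materials
    ├── 1_1_操作系统开源创新大赛原创承诺书.docx  # originality commitment letter
    ├── 演示视频.mp4                           # demo video
    ├── 项目功能说明书.pdf                      # project functional specification
    └── 高性能应用目录差异分析.pptx             # PPT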

(back to top)

Contact

Gonggu Chen - cggwz@mail.ustc.edu.cn

Shijun Yang - yangx7@mail.ustc.edu.cn

Ziwen Pang - pzw2002@mail.ustc.edu.cn

Project Link: 高性能应用目录差异分析服务 (High-Performance Application Directory Difference Analysis Service) | GitLink

(back to top)

License

This project is licensed under the Apache License, Version 2.0 - see the LICENSE file for details.

(back to top)


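High-Performance Application Directory Difference Analysis Service

How to complete and submit an entry: teams choosing this challenge must first fork this project, then add their team members to the fork and develop the entry there together. There is no need to submit a PR to the challenge project. If the entry is a document, please commit it to the project repository as well. While working on the entry, related discussions can be posted and held as issues, and milestones can be used to plan and manage the overall task.

1. Challenge Description

Participants need to implement a system service that identifies the differences between specific directories and outputs a list of the differing content. The business model is as follows: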

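Application A has its own private data directory A. According to its business needs, it accesses this directory and adds files or directories, deletes files or directories, and modifies files. The backup service is able to access application A's private data directory A and back up the relevant data to the backup space. Backups are either full backups or differential backups.

Participants need to implement a backup service. To simplify the problem, the organizers provide several data directories A1, A2, A3, and so on, which represent the state of application data directory A at times T1, T2, and T3 (where T1 < T2 < T3). The backup service must be able to analyze these directories and export multiple sets of backup data.

Details of data directory A

The data directory contains a large number of files and folders. For example:

  • The root directory contains folders 1..8.
  • Folder 1 contains one file of size size1.
  • Folder 2 contains two folders (called second-level folders), each of which contains two files of size size2.
  • Folder 3 contains 3 folders (second-level folders), each second-level folder contains 3 folders (third-level folders), and each third-level folder contains 3 files of size size3.
  • Folder N contains N second-level folders, each second-level folder contains N third-level folders, and so on down to the Nth-level folders, each of which contains N files.

File sizes differ between folders: the more files a folder holds, the smaller each file is, and the fewer files it holds, the larger each file is. To keep things simple, every first-level folder has the same total size.

A simple calculation:

  • When N=8, folder 8 has 8^8 = 16,777,216 files, each 1KB in size, for a total of 16GB.
  • When N=1, folder 1 has 1 file, and that file is 16GB in size.

(A file cannot actually occupy only 1KB on disk; it takes at least 4KB, and folders add their own overhead, so the space a folder really occupies will certainly exceed 16GB. These numbers may therefore be fine-tuned later.)

To simplify review, the format of the backup data is also specified:

  1. Each set of backup data is one directory a.
  2. The directory contains a text file b and a directory c. The text file stores the state information of the files or directories in this backup, one entry per line. Examples:
       (added)&&(path)||(startpos1 endpos1)
       (delete)&&(path)
       (modified)&&(path)||((startpos1 endpos1) || (startpos2 endpos2)|| …… )
  3. Directory c stores the actual file changes. For a modified file, the modified blocks are appended one after another according to their sizes.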
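
The folder-size arithmetic in the data directory description can be checked with a short Python sketch. It assumes that "16GB" means 16 GiB (16 × 1024^3 bytes) and that "1KB" means 1024 bytes; the function names are hypothetical and only illustrate the calculation.

```python
GIB = 1024 ** 3
# Assumption: every first-level folder totals 16 GiB, as in the N=1 and N=8 cases.
FOLDER_TOTAL_BYTES = 16 * GIB


def leaf_file_count(n: int) -> int:
    """Folder n fans out n ways at each of its n levels, giving n**n files."""
    return n ** n


def leaf_file_size(n: int) -> int:
    """Nominal per-file size in bytes, rounded down when the split is uneven."""
    return FOLDER_TOTAL_BYTES // leaf_file_count(n)


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, leaf_file_count(n), leaf_file_size(n))
```

For N=8 this gives 16,777,216 files of 1024 bytes each, and for N=1 a single 16 GiB file, matching the two cases stated in the description. For values of N where 16 GiB is not divisible by N^N (for example N=3), the sketch rounds down; the description does not say how the remainder is handled.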

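2. Challenge Requirements

  1) Full backup (simulating a full backup): data directory A1 is provided. Perform a full backup of the data directory, taking care to retain the relevant backup analysis information.
  2) Differential backup 1 (simulating an ordinary differential backup): data directories A1 and A2 are provided. Analyze the differences between A2 and A1, perform a differential backup, and output the complete set of difference data between A2 and A1. It is recommended to split large files into blocks of a suitable granularity (starting from 4KB).
  3) Differential backup 2 (simulating changes to the data directory during a backup): data directories A1, A2, and A3 are provided. Analyze the differences between A2 and A1 and output backup data 1. Then analyze the differences between A3 and A2, layer those differences on top of backup data 1, and output backup data 2. Computing the difference between A3 and A1 directly is not allowed.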
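
To illustrate the block-level splitting suggested for differential backup 1, here is a minimal Python sketch that compares two versions of a file in fixed 4KB blocks and reports the changed byte ranges. It is a deliberately simple fixed-offset comparison, not the librsync rolling-checksum algorithm that Csync builds on. The names changed_ranges and manifest_line are hypothetical, and treating each range as start-inclusive, end-exclusive is an assumption, since the challenge text does not define startpos and endpos precisely.

```python
from typing import List, Tuple

BLOCK_SIZE = 4096  # 4KB, the minimum granularity the challenge suggests


def changed_ranges(
    old: bytes, new: bytes, block_size: int = BLOCK_SIZE
) -> List[Tuple[int, int]]:
    """Return merged [start, end) ranges of `new` whose blocks differ from `old`.

    Truncation (new shorter than old) is not reported here; a real tool would
    also need to record the new file length.
    """
    ranges: List[Tuple[int, int]] = []
    for start in range(0, len(new), block_size):
        end = min(start + block_size, len(new))
        if new[start:end] == old[start:end]:
            continue  # block unchanged at this offset
        if ranges and ranges[-1][1] == start:
            # Adjacent changed blocks are merged into one range.
            ranges[-1] = (ranges[-1][0], end)
        else:
            ranges.append((start, end))
    return ranges


def manifest_line(path: str, ranges: List[Tuple[int, int]]) -> str:
    """Format a '(modified)' entry; the exact spacing is an assumption."""
    spans = " || ".join(f"({start} {end})" for start, end in ranges)
    return f"(modified)&&({path})||({spans})"
```

Because blocks are compared at fixed offsets, inserting a single byte near the start of a file marks every later block as changed. Rolling-checksum approaches such as the one in librsync exist to avoid exactly that cost.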

3. Challenge Mentors

易见 - salient.yijian@huawei.com

冒晶晶 - maojingjing1@huawei.com

张智伟 - weizhi.zhang@huawei.com

覃鸿巍 - qinhongwei7@huawei.com

4. References

  • Introduction to the OpenHarmony file management subsystem: https://gitee.com/openharmony/docs/blob/master/zh-cn/readme/%E6%96%87%E4%BB%B6%E7%AE%A1%E7%90%86%E5%AD%90%E7%B3%BB%E7%BB%9F.md
  • https://gitee.com/openharmony/filemanagement_app_file_service
  • Recommended hardware: Rk3568 development board

(back to top)
