[Debug tool] Commit slider (#14571)

Yury Gaydaychuk 2023-02-20 07:46:43 +01:00 committed by GitHub
parent 749ff8c93f
commit 7cffe848d6
GPG Key ID: 4AEE18F83AFDEB23
13 changed files with 1186 additions and 0 deletions

.gitignore

@ -57,3 +57,5 @@ __pycache__
/tools/mo/*.mapping
/tools/mo/*.dat
/tools/mo/*.svg
/src/plugins/intel_cpu/tools/commit_slider/*.json
/src/plugins/intel_cpu/tools/commit_slider/slider_cache/*


@ -0,0 +1,148 @@
# Commit slider tool
A tool that automatically iterates over a set of commits and applies a given operation to them, for example a binary search with a given criterion (checking application output, comparing printed blobs, etc.).
## Prerequisites
git >= *2.0*
cmake >= OpenVINO minimum required version ([CMakeLists.txt](../../../../CMakeLists.txt))
python >= *3.6*
ccache >= *3.0*
## Preparing
1. Install **CCache**:
`sudo apt install -y ccache`
`sudo /usr/sbin/update-ccache-symlinks`
`echo 'export PATH="/usr/lib/ccache:$PATH"' | tee -a ~/.bashrc`
`source ~/.bashrc && echo $PATH`
2. Check that **CCache** is installed via `which g++ gcc`
3. Run `sudo sh setup.sh`
## Setup custom config
*custom_cfg.json* may override any field of the general *utils/cfg.json*. The most important fields are listed below.
1. Define `makeCmd` - the build command required for your application.
2. Define `commandList`. Adjust *commandList* if you need a more specific way to build the target app. More details in [Custom command list](#ccl).
3. Replace `gitPath, buildPath` if your target is outside the current **OpenVINO** repo.
4. Set `appCmd, appPath` (mandatory) for the target application.
5. Set up `runConfig` (mandatory):
5.1. `getCommitListCmd` - *git* command returning the commit list, *if you don't want to set commit intervals with command args*
5.2. `mode` = `{checkOutput|bmPerf|compareBlobs|<to_extend>}` - criterion of commit comparison
5.3. `traversal` = `{firstFailedVersion|firstFixedVersion|allBreaks|<to_extend>}` - traversal rule
5.4. `preprocess` if you need preparation before commit building `<add_details>`.
5.5. Other fields depend on the mode, for example, `stopPattern` for `checkOutput` is a *RegEx* pattern matching the output of a failed application run.
6. Set up environment variables via the *envVars* field in the format:
`[{"name" : "key1", "val" : "val1"}, {"name" : "key2", "val" : "val2"}]`
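Judging by the `getParams()` helper in this commit, the override is shallow: each top-level key of *custom_cfg.json* replaces the same key of *utils/cfg.json* as a whole, so a custom `runConfig` is not merged field by field. A minimal sketch of that behaviour and of how an `envVars` list could become an environment mapping; the config values and the names `merge_cfg` and `env_from_cfg` are made up for illustration:

```python
import json

# Hypothetical stand-ins for utils/cfg.json and custom_cfg.json.
preset_cfg = json.loads("""
{
    "makeCmd": "cmake ..",
    "printCSV": false,
    "serviceConfig": {"comment": "For inner purpose."}
}
""")
custom_cfg = json.loads("""
{
    "makeCmd": "cmake -DCMAKE_BUILD_TYPE=Release ..",
    "envVars": [{"name": "key1", "val": "val1"}]
}
""")


def merge_cfg(preset, custom):
    """Shallow merge: every top-level custom key replaces the preset one."""
    merged = dict(preset)
    merged.update(custom)
    return merged


def env_from_cfg(cfg, base_env):
    """Turn the envVars list into an environment mapping."""
    env = dict(base_env)
    for item in cfg.get("envVars", []):
        env[item["name"]] = item["val"]
    return env


cfg = merge_cfg(preset_cfg, custom_cfg)
env = env_from_cfg(cfg, {"PATH": "/usr/bin"})
print(cfg["makeCmd"])
print(env)
```

Fields that are not overridden (here `printCSV`) keep their preset values.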
## Run commit slider
`python3 commit_slider.py {-c commit1..commit2 | -cfg path_to_config_file}`
`-c` overrides `getCommitListCmd` in *cfg.json*
## Output
In the common case, the output looks like this:
```
<build details>
Break commit: "hash_1", <details>
Break commit: "hash_2", <details>
<...>
Break commit: "hash_N", <details>
```
For every *x*, *hash_x* is the hash of a commit that caused a break, i.e. the previous commit is "good" in the sense of the current Mode and *hash_x* is "bad". `<details>` is determined by the Mode. The common log and the log for every commit are located in *log/*. If the `printCSV` flag is enabled, *log/* also contains *report.csv*.
## Examples
### Command line
`python3 commit_slider.py`
`python3 commit_slider.py -c e29169d..e4cf8ae`
`python3 commit_slider.py -c e29169d..e4cf8ae -cfg my_cfg.json`
### Predefined configurations
There are several predefined configurations in the *utils/cfg_samples* folder, which may be copied to *custom_cfg.json*. In every example, `<start_commit>..<end_commit>` is the interval to be analyzed. All examples illustrate the simplest binary search.
###### Performance task
Search for a performance degradation of *benchmark_app*.
*bm_perf.json*
`<model_path>` - path to the model for benchmarking; `perfAppropriateDeviation` may be changed to make the acceptance condition stricter or softer.
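The acceptance condition can be sketched as follows. It mirrors the comparison done in `BenchmarkAppPerformanceMode` in this commit, where the throughput of the checked commit is divided by the throughput of the sample commit; the function name `is_perf_degraded` is only illustrative.

```python
def is_perf_degraded(sample_throughput, current_throughput, deviation):
    """Return True if the relative throughput drop reaches the allowed deviation."""
    relation = current_throughput / sample_throughput
    return not ((1 - relation) < deviation)


# With perfAppropriateDeviation = 0.05, a 3% drop is accepted, a 10% drop is not.
small_drop = is_perf_degraded(100.0, 97.0, 0.05)
large_drop = is_perf_degraded(100.0, 90.0, 0.05)
print(small_drop, large_drop)
```

A smaller `perfAppropriateDeviation` makes the check stricter, since smaller drops are then reported as breaks.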
###### Comparison of blobs
Check for accuracy degradation via blob comparison.
*blob.cmp.json*
`<model_path>` - path to the model, `<blob_dir>` - directory for printing blobs, `<out_blob_name_tail>` - pattern of the blob name, not including the node id, for example *Output_Reshape_2-0_in0*. The `limit` of linewise difference may be changed or set to zero for bitwise comparison. `OV_CPU_BLOB_DUMP_NODE_TYPE` corresponds to the required node type; other dumping parameters may also be used.
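The linewise difference can be pictured as the largest absolute difference between corresponding numeric lines of two text dumps. The sketch below follows the idea of the `getBlobDiff` helper from this commit but works on in-memory lines; the name `max_linewise_diff` and the assumption that a difference above `limit` marks a commit as "bad" are illustrative, since the comparing mode itself is not shown here.

```python
def max_linewise_diff(lines, sample_lines):
    """Largest absolute difference between corresponding numeric lines."""
    max_diff = 0.0
    for line, sample_line in zip(lines, sample_lines):
        try:
            value = float(line)
            sample_value = float(sample_line)
        except ValueError:
            # Non-numeric lines (e.g. a header) are skipped.
            continue
        max_diff = max(max_diff, abs(value - sample_value))
    return max_diff


blob = ["shape: 1 3", "0.5", "1.0", "2.0"]
sample_blob = ["shape: 1 3", "0.5", "1.25", "2.0"]
diff = max_linewise_diff(blob, sample_blob)
limit = 0.05  # assumed meaning: differences above the limit count as a break
print(diff, diff > limit)
```

With `limit` set to zero, any numeric difference would count, which corresponds to the bitwise comparison mentioned above.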
###### Check output
Search for a failure of *benchmark_app*.
*custom_cfg.json*
`<model_path>` - path to model, `<bm_error_message>` - typical error message or part of it, e.g. *fail*.
###### Integration with e2e
Check for accuracy degradation with *e2e*. `<e2e_path>` - path to the e2e directory, `<e2e_args>` - parameters for e2e, `<ov_path>` - absolute path to *ov*, `<e2e_error_message>` - e2e error message.
## Possible troubles
In case of an unexpected failure, check */tmp/commit_slider_tool/log/*
###### Insufficient build command set
If some commit cannot be built, you can extend the command set in a custom command list. An example of a custom command list is below:
```
"commandList" : [
{"cmd" : "git rm --cached -r .", "path" : "{gitPath}", "tag" : "clean"},
{"cmd" : "git reset --hard", "path" : "{gitPath}", "tag" : "clean"},
{"cmd" : "git rm .gitattributes", "path" : "{gitPath}", "tag" : "clean"},
{"cmd" : "git reset .", "path" : "{gitPath}", "tag" : "clean"},
{"cmd" : "git checkout -- .", "path" : "{gitPath}", "tag" : "clean"},
{"cmd" : "git rm --cached -r .", "path" : "{gitPath}", "tag" : "clean"},
{"cmd" : "git reset --hard", "path" : "{gitPath}", "tag" : "clean"},
{"cmd" : "git rm .gitattributes", "path" : "{gitPath}", "tag" : "clean"},
{"cmd" : "git reset .", "path" : "{gitPath}", "tag" : "clean"},
{"cmd" : "git checkout -- .", "path" : "{gitPath}"},
{"cmd" : "git clean -fxd", "path" : "{gitPath}", "tag" : "clean"},
{"cmd" : "mkdir -p build", "path" : "{gitPath}"},
{"cmd" : "git checkout {commit}", "catchMsg" : "error", "path" : "{gitPath}"},
{"cmd" : "git submodule init", "path" : "{gitPath}"},
{"cmd" : "git submodule update --recursive", "path" : "{buildPath}"},
{"cmd" : "{makeCmd}", "catchMsg" : "CMake Error", "path" : "{buildPath}"},
{"cmd" : "make --jobs=4", "path" : "{buildPath}"},
{"cmd" : "git checkout -- .", "path" : "{gitPath}"}
]
```
More details in [Custom command list](#ccl).
###### Application failed with another error or output wasn't parsed correctly
Sometimes the target bug is masked by another, unexpected bug. In this case, it is reasonable to extend the error pattern, e.g. `(err_msg_1|err_msg_2)`, or to look for the new problem in a separate run.
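An extended pattern is an ordinary regular-expression alternation, written in the same style as the `stopPattern` values of the sample configs. A small sketch with placeholder messages; the helper name `is_bad_output` is hypothetical:

```python
import re

# Two alternative error messages combined into one stop pattern.
stop_pattern = "(.)*(err_msg_1|err_msg_2)(.)*"


def is_bad_output(output, pattern):
    """Return True if the application output matches the stop pattern."""
    return re.search(pattern, output) is not None


first = is_bad_output("step 1 ok\nerr_msg_1: something broke\n", stop_pattern)
second = is_bad_output("err_msg_2 at startup", stop_pattern)
clean = is_bad_output("Throughput: 123.45 FPS", stop_pattern)
print(first, second, clean)
```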
## Implementing custom mode
1. Override `checkIfBordersDiffer(i1, i2, list, cfg)` to define whether the given commits differ in terms of the given criterion.
2. Override `createCash(), getCommitIfCashed(commit), setCommitCash(commit, valueToCache)` if the predefined caching via a JSON map is insufficient for the current task.
3. Override `checkCfg()` to check that the user provided all necessary parameters.
4. Override `setOutputInfo(commit), getResult()` to set and interpret the parameters of the found commit.
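For step 2 it helps to see what the predefined caching amounts to: a single JSON map from commit hash to a stored value (application output, throughput, etc.). The following standalone sketch imitates that behaviour; the class `JsonCommitCache` and its method names are hypothetical and are not part of the tool.

```python
import json
import os
import tempfile


class JsonCommitCache:
    """Illustrative commit -> value cache stored as one JSON map."""

    def __init__(self, path):
        self.path = path
        if not os.path.exists(path):
            self._dump({})

    def _load(self):
        with open(self.path, "r") as cache_file:
            return json.load(cache_file)

    def _dump(self, data):
        with open(self.path, "w") as cache_file:
            json.dump(data, cache_file, indent=4)

    def get(self, commit):
        """Return (is_cached, value), similar to getCommitIfCashed()."""
        data = self._load()
        if commit in data:
            return True, data[commit]
        return False, None

    def set(self, commit, value):
        """Store a value once, similar to setCommitCash()."""
        if self.get(commit)[0]:
            raise KeyError("Commit already cached: " + commit)
        data = self._load()
        data[commit] = value
        self._dump(data)


with tempfile.TemporaryDirectory() as cache_dir:
    cache = JsonCommitCache(os.path.join(cache_dir, "cache.json"))
    miss = cache.get("e29169d")
    cache.set("e29169d", 123.4)
    hit = cache.get("e29169d")

print(miss, hit)
```

A custom mode would need its own caching only if the value to store does not fit into such a map, for example when whole blob files have to be kept per commit.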
## Implementing custom traversal rule
To implement a new `Traversal`, override the `bypass(i1, i2, list, cfg, commitPath)` method, using `checkIfBordersDiffer(i1, i2, list, cfg)` from Mode to decide whether the desired commit lies inside the given interval.
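The predefined `firstFailedVersion` rule is essentially a binary search. The sketch below shows the same idea in a standalone, iterative form; it assumes the commits are ordered from oldest to newest, that the first one is "good" and the last one is "bad", and it replaces the Mode's `checkIfBordersDiffer` with a plain hypothetical predicate `is_bad(index)`.

```python
def first_failed_version(commits, is_bad):
    """Binary search for the first commit for which is_bad(index) is True."""
    lo, hi = 0, len(commits) - 1
    while lo + 1 < hi:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid  # the break is at mid or earlier
        else:
            lo = mid  # the break is after mid
    return commits[lo] if is_bad(lo) else commits[hi]


commits = ["a", "b", "c", "d", "e", "f", "g"]
checked = []


def is_bad(index):
    checked.append(index)  # record which commits had to be built and run
    return index >= 4      # pretend commit "e" introduced the failure


break_commit = first_failed_version(commits, is_bad)
print(break_commit, checked)
```

Only a few commits of the interval are actually checked, which matters because every check means a rebuild of the target.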
## Implementing custom preprocess
1. Create the *utils/preprocess/your_custom_pp.py* file.
2. Define a `def your_custom_pp(cfg): <...>` function implementing the preprocessing step.
3. Add `"preprocess" : { "name" : "your_custom_pp", <other parameters> }` to `runConfig` and `{"tag" : "preprocess"}` to the build command list.
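As an illustration of step 2, a preprocess function could patch a source file before each build. Everything specific in this sketch is an assumption: the parameters `file`, `pattern` and `replacement` under `runConfig.preprocess` are invented for the example, and only the `cfg` argument and the `name` key come from the steps above.

```python
import os
import re
import tempfile


def your_custom_pp(cfg):
    """Hypothetical preprocess: patch one file before the commit is built."""
    pp_cfg = cfg["runConfig"]["preprocess"]
    with open(pp_cfg["file"], "r") as src:
        text = src.read()
    text = re.sub(pp_cfg["pattern"], pp_cfg["replacement"], text)
    with open(pp_cfg["file"], "w") as dst:
        dst.write(text)


# Demonstration on a temporary file instead of a real repository.
with tempfile.TemporaryDirectory() as repo:
    target = os.path.join(repo, "options.cmake")
    with open(target, "w") as f:
        f.write("set(TREAT_WARNINGS_AS_ERRORS ON)\n")
    cfg = {
        "runConfig": {
            "preprocess": {
                "name": "your_custom_pp",
                "file": target,
                "pattern": r"TREAT_WARNINGS_AS_ERRORS ON",
                "replacement": "TREAT_WARNINGS_AS_ERRORS OFF",
            }
        }
    }
    your_custom_pp(cfg)
    with open(target) as f:
        patched = f.read()

print(patched)
```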
## <a name="ccl"></a>Custom command list
The structure of a build command is
```
"commandList" : [
  {
"cmd" : "git rm --cached -r .",
"path" : "directory where command is to be run",
"tag" : "is used for managing of command flow (clean, preprocessing)",
"catchMsg" : "RegEx, like (.)*(error|failed|wrong executor)(.)*"
},
<...>
]
```
*cmd* - command to run, e.g. `git rm --cached -r .`; *path* - directory to run the command in, commonly the git root or the build directory; *tag* - marks a command that should be executed only under special conditions, commonly `preprocess` or `clean`; *catchMsg* - RegEx to check the output against, necessary because exception handling in the python subprocess API is unreliable.
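The placeholders `{commit}`, `{makeCmd}`, `{gitPath}` and `{buildPath}` are filled in with ordinary string formatting before a command is run. The sketch below shows how a command list could be expanded into concrete command lines and working directories; it only approximates the tool's behaviour, and the function name `expand_commands` and the sample paths are hypothetical.

```python
def expand_commands(command_list, commit, make_cmd, git_path, build_path,
                    skip_clean=False):
    """Return (argv, cwd) pairs for the commands that would be run."""
    expanded = []
    for item in command_list:
        tag = item.get("tag")
        if tag == "clean" and skip_clean:
            continue  # clean steps may be skipped to speed up rebuilds
        if tag == "preprocess":
            continue  # handled by a Python preprocess function, not a shell command
        cmd = item["cmd"].format(commit=commit, makeCmd=make_cmd)
        # Commands without "path" are assumed to run in the git root.
        cwd = item.get("path", "{gitPath}").format(
            gitPath=git_path, buildPath=build_path
        )
        expanded.append((cmd.split(), cwd))
    return expanded


command_list = [
    {"cmd": "git clean -fxd", "path": "{gitPath}", "tag": "clean"},
    {"cmd": "git checkout {commit}", "catchMsg": "error", "path": "{gitPath}"},
    {"tag": "preprocess"},
    {"cmd": "{makeCmd}", "catchMsg": "CMake Error", "path": "{buildPath}"},
]
steps = expand_commands(
    command_list, "e29169d", "cmake ..", "/repo", "/repo/build", skip_clean=True
)
for argv, cwd in steps:
    print(cwd, argv)
```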


@ -0,0 +1,70 @@
import subprocess
import os
import shutil
import sys
from distutils.dir_util import copy_tree
from utils.helpers import safeClearDir, getParams

args, cfgData, customCfgPath = getParams()

if args.__dict__["isWorkingDir"]:
    # rerun script from work directory
    from utils.modes import Mode
    from utils.helpers import CfgError
    from utils.helpers import checkArgAndGetCommits

    commitList = []
    if args.__dict__["commitSeq"] is None:
        if "getCommitListCmd" in cfgData["runConfig"]["commitList"]:
            commitListCmd = cfgData["runConfig"]["commitList"]
            commitListCmd = commitListCmd["getCommitListCmd"]
            cwd = cfgData["gitPath"]
            try:
                out = subprocess.check_output(commitListCmd.split(), cwd=cwd)
            except subprocess.CalledProcessError as e:
                msg = "Commit list command caused error"
                raise CfgError("{msg} {e}".format(msg=msg, e=str(e)))

            out = out.decode("utf-8")
            commitList = out.split()
        else:
            raise CfgError("Commit list is mandatory")
    else:
        commitList = checkArgAndGetCommits(args.__dict__["commitSeq"], cfgData)

    commitList.reverse()
    p = Mode.factory(cfgData)
    p.run(0, len(commitList) - 1, commitList, cfgData)
    p.getResult()

else:
    workPath = cfgData["workPath"]
    if not os.path.exists(workPath):
        os.mkdir(workPath)
    else:
        safeClearDir(workPath)
    curPath = os.getcwd()
    copy_tree(curPath, workPath)
    scriptName = os.path.basename(__file__)
    argString = " ".join(sys.argv)
    formattedCmd = "{py} {workPath}/{argString} -wd".format(
        py=sys.executable, workPath=workPath, argString=argString
    )
    subprocess.call(formattedCmd.split())

    # copy logs and cache back to general repo
    tempLogPath = cfgData["logPath"].format(workPath=workPath)
    permLogPath = cfgData["logPath"].format(workPath=curPath)
    safeClearDir(permLogPath)
    copy_tree(tempLogPath, permLogPath)

    tempCachePath = cfgData["cachePath"].format(workPath=workPath)
    permCachePath = cfgData["cachePath"].format(workPath=curPath)
    safeClearDir(permCachePath)
    copy_tree(tempCachePath, permCachePath)

    shutil.copyfile(
        os.path.join(workPath, customCfgPath),
        os.path.join(curPath, customCfgPath),
        follow_symlinks=True,
    )
    safeClearDir(workPath)


@ -0,0 +1,5 @@
#!/bin/bash
sudo mkdir -p /tmp/commit_slider_tool
sudo chmod a+trwx /tmp/commit_slider_tool
touch custom_cfg.json
chmod 664 custom_cfg.json


@ -0,0 +1,41 @@
{
"modeMap" : {
"checkOutput" : "CheckOutputMode",
"bmPerf" : "BenchmarkAppPerformanceMode",
"compareBlobs" : "CompareBlobsMode"
},
"traversalMap" : {
"firstFailedVersion" : "FirstFailedVersion",
"firstFixedVersion" : "FirstFixedVersion",
"allBreaks" : "AllBreakVersions",
"checkCommitSet" : "IterateOverSuspiciousCommits",
"bruteForce" : "BruteForce"
},
"commandList" : [
{"cmd" : "git checkout -- .", "path" : "{gitPath}"},
{"cmd" : "git clean -fxd", "path" : "{gitPath}", "tag" : "clean"},
{"cmd" : "mkdir -p build", "path" : "{gitPath}"},
{"cmd" : "git checkout {commit}", "catchMsg" : "error", "path" : "{gitPath}"},
{"cmd" : "git submodule init", "path" : "{gitPath}"},
{"cmd" : "git submodule update --recursive", "path" : "{buildPath}"},
{"cmd" : "{makeCmd}", "catchMsg" : "CMake Error", "path" : "{buildPath}"},
{"cmd" : "make --jobs=4", "path" : "{buildPath}"},
{"cmd" : "git checkout -- .", "path" : "{gitPath}"}
],
"makeCmd" : "cmake ..",
"returnCmd" : "git checkout master",
"gitPath" : "../../../../../",
"appPath" : "../../../../../bin/intel64/Release/",
"buildPath" : "../../../../../build/",
"cachePath" : "{workPath}/slider_cache/",
"logPath" : "{workPath}/log/",
"workPath" : "/tmp/commit_slider_tool",
"clearCache" : false,
"noCleanInterval" : 10,
"checkIfBordersDiffer" : false,
"printCSV" : false,
"usePrevRunCache" : false,
"serviceConfig" : {
"comment" : "For inner purpose. Data will be overwritten during script running."
}
}


@ -0,0 +1,20 @@
{
"makeCmd" : "cmake -DCMAKE_BUILD_TYPE=Release -DENABLE_PYTHON=OFF -DTHREADING=TBB -DENABLE_MKL_DNN=ON -DENABLE_CLDNN=OFF -DENABLE_INTEL_GNA=OFF -DENABLE_INTEL_VPU=OFF -DENABLE_INTEL_MYRIAD=OFF -DENABLE_INTEL_MYRIAD_COMMON=OFF -DENABLE_HDDL=OFF -DENABLE_MODELS=OFF -DENABLE_SAMPLES=ON -DENABLE_TESTS=OFF -DENABLE_HETERO=OFF -DENABLE_TEMPLATE=OFF -DENABLE_CPU_DEBUG_CAPS=ON -DENABLE_DEBUG_CAPS=ON -DENABLE_OV_CORE_BACKEND_UNIT_TESTS=OFF -DENABLE_OPENVINO_DEBUG=ON ..",
"appCmd" : "./benchmark_app -m <model_path> -t 10",
"envVars" : [
{"name" : "OV_CPU_BLOB_DUMP_NODE_TYPE", "val" : "Output"},
{"name" : "OV_CPU_BLOB_DUMP_FORMAT", "val" : "TEXT"},
{"name" : "OV_CPU_BLOB_DUMP_DIR", "val" : "<blob_dir>"}
],
"runConfig" : {
"commitList" : {
"getCommitListCmd" : "git log <start_commit>..<end_commit> --boundary --pretty=\"%h\""
},
"mode" : "compareBlobs",
"traversal" : "firstFailedVersion",
"outputFileNamePattern" : "(.)*<out_blob_name_tail>.ieb$",
"outputDirectory" : "<blob_dir>",
"limit" : 0.05
}
}


@ -0,0 +1,12 @@
{
"appCmd" : "./benchmark_app -m <model_path> -d CPU -t 10",
"makeCmd" : "cmake -DCMAKE_BUILD_TYPE=Release -DENABLE_PYTHON=OFF -DTHREADING=TBB -DENABLE_MKL_DNN=ON -DENABLE_CLDNN=OFF -DENABLE_INTEL_GNA=OFF -DENABLE_INTEL_VPU=OFF -DENABLE_INTEL_MYRIAD=OFF -DENABLE_INTEL_MYRIAD_COMMON=OFF -DENABLE_HDDL=OFF -DENABLE_MODELS=OFF -DENABLE_SAMPLES=ON -DENABLE_TESTS=OFF -DENABLE_HETERO=OFF -DENABLE_TEMPLATE=OFF -DENABLE_CPU_DEBUG_CAPS=OFF -DENABLE_DEBUG_CAPS=OFF -DENABLE_OV_CORE_BACKEND_UNIT_TESTS=OFF -DENABLE_OPENVINO_DEBUG=OFF ..",
"runConfig" : {
"commitList" : {
"getCommitListCmd" : "git log <start_commit>..<end_commit> --boundary --pretty=\"%h\""
},
"mode" : "checkOutput",
"traversal" : "firstFailedVersion",
"stopPattern" : "(.)*<bm_error_message>(.)*"
}
}


@ -0,0 +1,12 @@
{
"appCmd" : "./benchmark_app -m <model_path> -d CPU -hint throughput -inference_only=false -t 60",
"makeCmd" : "cmake -DCMAKE_BUILD_TYPE=Release -DENABLE_PYTHON=OFF -DTHREADING=TBB -DENABLE_MKL_DNN=ON -DENABLE_CLDNN=OFF -DENABLE_INTEL_GNA=OFF -DENABLE_INTEL_VPU=OFF -DENABLE_INTEL_MYRIAD=OFF -DENABLE_INTEL_MYRIAD_COMMON=OFF -DENABLE_HDDL=OFF -DENABLE_MODELS=OFF -DENABLE_SAMPLES=ON -DENABLE_TESTS=OFF -DENABLE_HETERO=OFF -DENABLE_TEMPLATE=OFF -DENABLE_CPU_DEBUG_CAPS=OFF -DENABLE_DEBUG_CAPS=OFF -DENABLE_OV_CORE_BACKEND_UNIT_TESTS=OFF -DENABLE_OPENVINO_DEBUG=OFF ..",
"runConfig" : {
"commitList" : {
"getCommitListCmd" : "git log <start_commit>..<end_commit> --boundary --pretty=\"%h\""
},
"mode" : "bmPerf",
"traversal" : "firstFailedVersion",
"perfAppropriateDeviation" : 0.05
}
}


@ -0,0 +1,19 @@
{
"appPath" : "/<e2e_path>/e2e/frameworks.ai.openvino.tests/e2e_oss/",
"appCmd" : "pytest test_dynamism.py <e2e_args>",
"envVars" : [
{"name" : "PYTHONPATH", "val" : "/<ov_path>/bin/intel64/Release/python_api/python3.8/"},
{"name" : "LD_LIBRARY_PATH", "val" : "/<ov_path>/bin/intel64/Release/"},
{"name" : "MO_ROOT", "val" : "/<ov_path>/tools/mo/openvino/tools/"},
{"name" : "OPENVINO_ROOT_DIR", "val" : "/<ov_path>/"}
],
"makeCmd" : "cmake -DCMAKE_BUILD_TYPE=Release -DENABLE_PYTHON=ON -DPYTHON_EXECUTABLE=/usr/bin/python3.8 -DPYTHON_LIBRARY=/usr/lib/x86_64-linux-gnu/libpython3.8.so -DPYTHON_INCLUDE_DIR=/usr/include/python3.8 -DTHREADING=TBB -DENABLE_MKL_DNN=ON -DENABLE_CLDNN=OFF -DENABLE_INTEL_GNA=OFF -DENABLE_INTEL_VPU=OFF -DENABLE_INTEL_MYRIAD=OFF -DENABLE_INTEL_MYRIAD_COMMON=OFF -DENABLE_HDDL=OFF -DENABLE_MODELS=OFF -DENABLE_SAMPLES=OFF -DENABLE_TESTS=OFF -DENABLE_CPU_DEBUG_CAPS=OFF -DENABLE_HETERO=OFF -DENABLE_TEMPLATE=OFF -DENABLE_CPU_DEBUG_CAPS=OFF -DENABLE_DEBUG_CAPS=OFF -DENABLE_OV_CORE_BACKEND_UNIT_TESTS=OFF -DENABLE_OPENVINO_DEBUG=OFF ..",
"runConfig" : {
"commitList" : {
"getCommitListCmd" : "git log <start_commit>..<end_commit> --boundary --pretty=\"%h\""
},
"mode" : "checkOutput",
"traversal" : "firstFailedVersion",
"stopPattern" : "(.)*<e2e_error_message>(.)*"
}
}


@ -0,0 +1,285 @@
from abc import ABC
import utils.helpers as util
import json
import os
from enum import Enum
import csv
class Mode(ABC):
@staticmethod
def factory(cfg):
modeClassName = util.checkAndGetClassnameByConfig(
cfg, "modeMap", "mode"
)
cl = util.checkAndGetSubclass(modeClassName, Mode)
return cl(cfg)
def __init__(self, cfg) -> None:
self.checkCfg(cfg)
self.commitPath = self.CommitPath()
traversalClassName = util.checkAndGetClassnameByConfig(
cfg, "traversalMap", "traversal"
)
traversalClass = util.checkAndGetSubclass(
traversalClassName, self.Traversal
)
self.traversal = traversalClass(self)
self.cfg = cfg
logPath = util.getActualPath("logPath", cfg)
self.commonLogger = util.setupLogger(
"commonLogger", logPath, "common_log.log"
)
def createCash(self):
# In common case we use json.
# Create cash is overrided if we need special algo for caching.
cp = util.getActualPath("cachePath", self.cfg)
if not os.path.exists(cp):
os.makedirs(cp)
self.cachePath = os.path.join(cp, "check_output_cache.json")
initCacheMap = {}
try:
cacheDump = open(self.cachePath, "r+")
if self.cfg["clearCache"]:
cacheDump.truncate(0)
json.dump(initCacheMap, cacheDump)
else:
try:
json.load(cacheDump)
except json.decoder.JSONDecodeError:
json.dump(initCacheMap, cacheDump)
except FileNotFoundError:
cacheDump = open(self.cachePath, "w")
json.dump(initCacheMap, cacheDump)
cacheDump.close()
def getCommitIfCashed(self, commit):
with open(self.cachePath, "r") as cacheDump:
cacheData = json.load(cacheDump)
cacheDump.close()
if commit in cacheData:
return True, cacheData[commit]
else:
return False, None
def setCommitCash(self, commit, valueToCache):
isCommitCashed, _ = self.getCommitIfCashed(commit)
if isCommitCashed:
raise util.CashError("Commit already cashed")
else:
with open(self.cachePath, "r+", encoding="utf-8") as cacheDump:
cacheData = json.load(cacheDump)
cacheData[commit] = valueToCache
cacheDump.seek(0)
json.dump(cacheData, cacheDump, indent=4)
cacheDump.truncate()
cacheDump.close()
def checkCfg(self, cfg):
if not ("traversal" in cfg["runConfig"]):
raise util.CfgError("traversal is not configured")
def prepareRun(self, i1, i2, list, cfg):
cfg["serviceConfig"] = {}
if cfg["checkIfBordersDiffer"] and not self.checkIfListBordersDiffer(
list, cfg):
raise util.RepoError("Borders {i1} and {i2} doesn't differ".format(
i1=i1, i2=i2))
self.commitList = list
def postRun(self, list):
util.returnToActualVersion(self.cfg)
if "printCSV" in self.cfg and self.cfg["printCSV"]:
fields = ['linId', 'logId', 'hash', 'value']
rows = []
linearId = 0
logId = 0
for item in list:
item = item.replace('"', "")
isCommitCashed, value = self.getCommitIfCashed(item)
if isCommitCashed:
row = [linearId, logId, item, value]
rows.append(row)
logId = logId + 1
linearId = linearId + 1
reportPath = util.getActualPath("logPath", self.cfg)
reportPath = os.path.join(reportPath, "report.csv")
with open(reportPath, 'w') as csvfile:
csvwriter = csv.writer(csvfile)
csvwriter.writerow(fields)
csvwriter.writerows(rows)
def run(self, i1, i2, list, cfg) -> int:
self.prepareRun(i1, i2, list, cfg)
self.traversal.bypass(
i1, i2, list, cfg, self.commitPath
)
self.postRun(list)
def setOutputInfo(self, pathCommit):
# override if you need more details in output representation
pass
def getResult(self):
# override if you need more details in output representation
for pathcommit in self.commitPath.getList():
print("Break commit: {c}".format(
c=self.commitList[pathcommit.id])
)
def checkIfBordersDiffer(self, i1, i2, list, cfg):
raise NotImplementedError("checkIfBordersDiffer() is not implemented")
def checkIfListBordersDiffer(self, list, cfg):
return self.checkIfBordersDiffer(0, len(list) - 1, list, cfg)
class CommitPath:
def __init__(self):
self.commitList = []
def accept(self, traversal, commitToReport) -> None:
traversal.visit(self, commitToReport)
class CommitState(Enum):
BREAK = 1
SKIPPED = 2
class PathCommit:
def __init__(self, id, state):
self.id = id
self.state = state
def append(self, commit):
self.commitList.append(commit)
def pop(self):
return self.commitList.pop(0)
def getList(self):
return self.commitList
class Traversal(ABC):
def bypass(self, i1, i2, list, cfg, commitPath) -> int:
raise NotImplementedError()
def visit(self, cPath, commitToReport):
cPath.append(commitToReport)
def prepBypass(self, i1, i2, list, cfg):
skipInterval = cfg["noCleanInterval"]
cfg["serviceConfig"]["skipCleanInterval"] = i2 - i1 < skipInterval
self.mode.commonLogger.info(
"Check interval {i1}..{i2}".format(i1=i1, i2=i2)
)
self.mode.commonLogger.info(
"Check commits {c1}..{c2}".format(c1=list[i1], c2=list[i2])
)
def __init__(self, mode) -> None:
self.mode = mode
class FirstFailedVersion(Traversal):
def __init__(self, mode) -> None:
super().__init__(mode)
def bypass(self, i1, i2, list, cfg, commitPath) -> int:
self.prepBypass(i1, i2, list, cfg)
sampleCommit = 0
if "sampleCommit" in cfg["serviceConfig"]:
sampleCommit = cfg["serviceConfig"]["sampleCommit"]
if i1 + 1 >= i2:
isBad = self.mode.checkIfBordersDiffer(
sampleCommit, i1, list, cfg)
breakCommit = i1 if isBad else i2
pc = Mode.CommitPath.PathCommit(
breakCommit,
Mode.CommitPath.CommitState.BREAK
)
self.mode.setOutputInfo(pc)
commitPath.accept(self, pc)
return
mid = (int)((i1 + i2) / 2)
isBad = self.mode.checkIfBordersDiffer(
sampleCommit, mid, list, cfg)
if isBad:
self.bypass(
i1, mid, list, cfg, commitPath
)
else:
self.bypass(
mid, i2, list, cfg, commitPath
)
class FirstFixedVersion(Traversal):
def __init__(self, mode) -> None:
super().__init__(mode)
def bypass(self, i1, i2, list, cfg, commitPath) -> int:
self.prepBypass(i1, i2, list, cfg)
sampleCommit = 0
if "sampleCommit" in cfg["serviceConfig"]:
sampleCommit = cfg["serviceConfig"]["sampleCommit"]
if i1 + 1 >= i2:
isBad = self.mode.checkIfBordersDiffer(
sampleCommit, i1, list, cfg)
breakCommit = i2 if isBad else i1
pc = Mode.CommitPath.PathCommit(
breakCommit,
Mode.CommitPath.CommitState.BREAK
)
self.mode.setOutputInfo(pc)
commitPath.accept(self, pc)
return
mid = (int)((i1 + i2) / 2)
isBad = self.mode.checkIfBordersDiffer(
sampleCommit, mid, list, cfg)
if isBad:
self.bypass(
mid, i2, list, cfg, commitPath
)
else:
self.bypass(
i1, mid, list, cfg, commitPath
)
class AllBreakVersions(Traversal):
def __init__(self, mode) -> None:
super().__init__(mode)
def bypass(self, i1, i2, list, cfg, commitPath) -> int:
self.prepBypass(i1, i2, list, cfg)
sampleCommit = 0
if "sampleCommit" in cfg["serviceConfig"]:
sampleCommit = cfg["serviceConfig"]["sampleCommit"]
if i1 + 1 >= i2:
isBad = self.mode.checkIfBordersDiffer(
sampleCommit, i1, list, cfg)
breakCommit = i1 if isBad else i2
pc = Mode.CommitPath.PathCommit(
breakCommit,
Mode.CommitPath.CommitState.BREAK
)
self.mode.setOutputInfo(pc)
commitPath.accept(self, pc)
lastCommit = len(list) - 1
isTailDiffer = self.mode.checkIfBordersDiffer(
breakCommit, lastCommit, list, cfg)
if isTailDiffer:
cfg["serviceConfig"]["sampleCommit"] = breakCommit
self.bypass(
breakCommit, lastCommit,
list, cfg, commitPath
)
return
mid = (int)((i1 + i2) / 2)
isBad = self.mode.checkIfBordersDiffer(
sampleCommit, mid, list, cfg)
if isBad:
self.bypass(
i1, mid, list, cfg, commitPath
)
else:
self.bypass(
mid, i2, list, cfg, commitPath
)


@ -0,0 +1,304 @@
import importlib
import os
import sys
import subprocess
import re
import json
import logging as log
from argparse import ArgumentParser
def getMeaningfullCommitTail(commit):
return commit[:7]
def getParams():
parser = ArgumentParser()
parser.add_argument("-c", "--commits", dest="commitSeq", help="commit set")
parser.add_argument(
"-cfg",
"--config",
dest="configuration",
help="configuration source",
default="custom_cfg.json",
)
parser.add_argument(
"-wd",
"--workdir",
dest="isWorkingDir",
action="store_true",
help="flag if current directory is working",
)
args = parser.parse_args()
presetCfgPath = "utils/cfg.json"
customCfgPath = ""
customCfgPath = args.__dict__["configuration"]
cfgFile = open(presetCfgPath)
presetCfgData = json.load(cfgFile)
cfgFile.close()
cfgFile = open(customCfgPath)
customCfgData = json.load(cfgFile)
cfgFile.close()
# customize cfg
for key in customCfgData:
newVal = customCfgData[key]
presetCfgData[key] = newVal
presetCfgData = absolutizePaths(presetCfgData)
return args, presetCfgData, customCfgPath
def getBlobDiff(file1, file2):
with open(file1) as file:
content = file.readlines()
with open(file2) as sampleFile:
sampleContent = sampleFile.readlines()
# ignore first line with memory address
i = -1
curMaxDiff = 0
for sampleLine in sampleContent:
i = i + 1
if i >= len(content):
break
line = content[i]
sampleVal = 0
val = 0
try:
sampleVal = float(sampleLine)
val = float(line)
except ValueError:
continue
if val != sampleVal:
curMaxDiff = max(curMaxDiff, abs(val - sampleVal))
return curMaxDiff
def absolutizePaths(cfg):
pathToAbsolutize = ["gitPath", "buildPath", "appPath", "workPath"]
for item in pathToAbsolutize:
path = cfg[item]
path = os.path.abspath(path)
cfg[item] = path
if "preprocess" in cfg["runConfig"]:
prepFile = cfg["runConfig"]["preprocess"]["file"]
prepFile = os.path.abspath(prepFile)
cfg["runConfig"]["preprocess"]["file"] = prepFile
return cfg
def checkArgAndGetCommits(commArg, cfgData):
# WA because of python bug with
# re.search("^[a-zA-Z0-9]+\.\.[a-zA-Z0-9]+$", commArg)
if not len(commArg.split("..")) == 2:
raise ValueError("{arg} is not correct commit set".format(arg=commArg))
else:
getCommitSetCmd = 'git log {interval} --boundary --pretty="%h"'.format(
interval=commArg
)
proc = subprocess.Popen(
getCommitSetCmd.split(),
cwd=cfgData["gitPath"],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
)
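# communicate() alone waits for the process; calling wait() first
# can deadlock when the output pipe buffer fills up
out, err = proc.communicate()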
out = out.decode("utf-8")
outList = out.split()
if re.search(".*fatal.*", out):
print(out)
raise ValueError("{arg} commit set is invalid".format(arg=commArg))
elif len(outList) == 0:
raise ValueError("{arg} commit set is empty".format(arg=commArg))
else:
return outList
def runCommandList(commit, cfgData, enforceClean=False):
skipCleanInterval = False
if "trySkipClean" not in cfgData:
skipCleanInterval = not enforceClean
else:
skipCleanInterval = cfgData["trySkipClean"] and not enforceClean
commitLogger = getCommitLogger(cfgData, commit)
commandList = cfgData["commandList"]
gitPath = cfgData["gitPath"]
buildPath = cfgData["buildPath"]
defRepo = gitPath
for cmd in commandList:
if "tag" in cmd:
if cmd["tag"] == "clean" and skipCleanInterval:
continue
elif cmd["tag"] == "preprocess":
if not (
"preprocess" in cfgData["runConfig"]
and "name" in cfgData["runConfig"]["preprocess"]
):
raise CfgError("No preprocess provided")
prePrName = cfgData["runConfig"]["preprocess"]["name"]
mod = importlib.import_module(
"utils.preprocess.{pp}".format(pp=prePrName)
)
preProcess = getattr(mod, prePrName)
preProcess(cfgData)
continue
makeCmd = cfgData["makeCmd"]
strCommand = cmd["cmd"].format(commit=commit, makeCmd=makeCmd)
formattedCmd = strCommand.split()
cwd = defRepo
if "path" in cmd:
cwd = cmd["path"].format(buildPath=buildPath, gitPath=gitPath)
commitLogger.info("Run command: {command}".format(
command=formattedCmd)
)
proc = subprocess.Popen(
formattedCmd, cwd=cwd, stdout=subprocess.PIPE,
stderr=subprocess.STDOUT
)
checkOut = ""
for line in proc.stdout:
# decode if line is byte-type
try:
line = line.decode("utf-8")
except (UnicodeDecodeError, AttributeError):
pass
sys.stdout.write(line)
commitLogger.info(line)
# accumulate the output here: this loop exhausts proc.stdout,
# so a later communicate() would return nothing to check catchMsg against
checkOut += line
proc.wait()
if "catchMsg" in cmd:
isErrFound = re.search(cmd["catchMsg"], checkOut)
if isErrFound:
if skipCleanInterval:
commitLogger.info("Build error: clean is necessary")
raise NoCleanFailedError()
else:
raise CmdError(checkOut)
def fetchAppOutput(cfg):
newEnv = os.environ.copy()
if "envVars" in cfg:
for env in cfg["envVars"]:
envKey = env["name"]
envVal = env["val"]
newEnv[envKey] = envVal
appCmd = cfg["appCmd"]
appPath = cfg["appPath"]
p = subprocess.Popen(
appCmd.split(),
cwd=appPath,
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT,
env=newEnv,
)
output, err = p.communicate()
output = output.decode("utf-8")
return output
def handleCommit(commit, cfgData):
if "skipCleanInterval" in cfgData["serviceConfig"]:
skipCleanInterval = cfgData["serviceConfig"]["skipCleanInterval"]
cfgData["trySkipClean"] = skipCleanInterval
try:
runCommandList(commit, cfgData)
except (NoCleanFailedError):
cfgData["trySkipClean"] = False
runCommandList(commit, cfgData)
def returnToActualVersion(cfg):
cmd = cfg["returnCmd"]
cwd = cfg["gitPath"]
proc = subprocess.Popen(
cmd.split(), cwd=cwd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
)
proc.wait()
return
def setupLogger(name, logPath, logFileName, level=log.INFO):
if not os.path.exists(logPath):
os.makedirs(logPath)
logFileName = os.path.join(logPath, logFileName)
open(logFileName, "w").close() # clear old log
handler = log.FileHandler(logFileName)
formatter = log.Formatter("%(asctime)s %(levelname)s %(message)s")
handler.setFormatter(formatter)
logger = log.getLogger(name)
logger.setLevel(level)
logger.addHandler(handler)
return logger
def getCommitLogger(cfg, commit):
logName = "commitLogger_{c}".format(c=commit)
if log.getLogger(logName).hasHandlers():
return log.getLogger(logName)
logPath = getActualPath("logPath", cfg)
logFileName = "commit_{c}.log".format(c=commit)
commitLogger = setupLogger(logName, logPath, logFileName)
return commitLogger
def getActualPath(pathName, cfg):
workPath = cfg["workPath"]
curPath = cfg[pathName]
return curPath.format(workPath=workPath)
def safeClearDir(path):
if not os.path.exists(path):
os.makedirs(path)
p = subprocess.Popen(
"rm -rf *", cwd=path,
stdout=subprocess.PIPE, shell=True
)
p.wait()
return
class CfgError(Exception):
pass
class CashError(Exception):
pass
class CmdError(Exception):
pass
class NoCleanFailedError(Exception):
pass
class RepoError(Exception):
pass
def checkAndGetClassnameByConfig(cfg, mapName, specialCfg):
keyName = cfg["runConfig"][specialCfg]
map = cfg[mapName]
if not (keyName in map):
raise CfgError(
"{keyName} is not registered in {mapName}".format(
keyName=keyName, mapName=mapName
)
)
else:
return map[keyName]
def checkAndGetSubclass(clName, parentClass):
cl = [cl for cl in parentClass.__subclasses__() if cl.__name__ == clName]
if not (cl.__len__() == 1):
raise CfgError("Class {clName} doesn't exist".format(clName=clName))
else:
return cl[0]


@ -0,0 +1,256 @@
import os
from utils.helpers import fetchAppOutput, getActualPath
from utils.helpers import getMeaningfullCommitTail
from utils.helpers import handleCommit, runCommandList, getBlobDiff
from utils.helpers import getCommitLogger, CashError, CfgError, CmdError
import re
import shutil
from utils.common_mode import Mode
class CheckOutputMode(Mode):
def __init__(self, cfg):
super().__init__(cfg)
self.createCash()
def checkCfg(self, cfg):
super().checkCfg(cfg)
if not ("stopPattern" in cfg["runConfig"]):
raise CfgError("stopPattern is not configured")
def checkIfBordersDiffer(self, i1, i2, list, cfg):
isLeftBorderFailed = False
if i1 != 0 or cfg["checkIfBordersDiffer"]:
isLeftBorderFailed = self.isBadVersion(list[i1], cfg)
isRightBorderGood = not self.isBadVersion(list[i2], cfg)
rightCommit = list[i2]
rightCommit = rightCommit.replace('"', "")
commitLogger = getCommitLogger(cfg, rightCommit)
commitLogger.info(
"Commit {c} is {status}".format(
status=("good" if isRightBorderGood else "bad"),
c=list[i2])
)
return isLeftBorderFailed == isRightBorderGood
def isBadVersion(self, commit, cfg):
commit = commit.replace('"', "")
checkOut = ""
commitLogger = getCommitLogger(cfg, commit)
isCommitCashed, cashedOutput = self.getCommitIfCashed(commit)
if isCommitCashed:
logMsg = "Cashed commit - {commit}".format(commit=commit)
self.commonLogger.info(logMsg)
commitLogger.info(logMsg)
checkOut = cashedOutput
else:
self.commonLogger.info("New commit: {commit}".format(
commit=commit)
)
handleCommit(commit, cfg)
checkOut = fetchAppOutput(cfg)
commitLogger.info(checkOut)
self.setCommitCash(commit, checkOut)
stopPattern = cfg["runConfig"]["stopPattern"]
isFound = re.search(stopPattern, checkOut)
return isFound
class BenchmarkAppPerformanceMode(Mode):
def __init__(self, cfg):
super().__init__(cfg)
self.outPattern = r"Throughput:\s*([0-9]*[.][0-9]*)\s*FPS"
self.perfRel = 0
self.createCash()
def prepareRun(self, i1, i2, list, cfg):
super().prepareRun(i1, i2, list, cfg)
sampleCommit = list[i1]
sampleCommit = sampleCommit.replace('"', "")
self.commonLogger.info(
"Prepare sample commit - {commit}".format(commit=sampleCommit)
)
commitLogger = getCommitLogger(cfg, sampleCommit)
foundThroughput = 0
isCommitCashed, cashedThroughput = self.getCommitIfCashed(sampleCommit)
if isCommitCashed:
logMsg = "Cashed commit - {commit}".format(commit=sampleCommit)
self.commonLogger.info(logMsg)
commitLogger.info(logMsg)
foundThroughput = cashedThroughput
else:
runCommandList(sampleCommit, cfg, enforceClean=True)
output = fetchAppOutput(cfg)
commitLogger.info(output)
foundThroughput = re.search(
self.outPattern, output, flags=re.MULTILINE
).group(1)
self.setCommitCash(sampleCommit, float(foundThroughput))
self.sampleThroughput = float(foundThroughput)
def checkCfg(self, cfg):
super().checkCfg(cfg)
if not ("perfAppropriateDeviation" in cfg["runConfig"]):
raise CfgError("Appropriate deviation is not configured")
else:
self.apprDev = cfg["runConfig"]["perfAppropriateDeviation"]
def checkIfBordersDiffer(self, i1, i2, list, cfg):
leftThroughput = self.getThroughputByCommit(list[i1], cfg)
rightCommit = list[i2]
rightThroughput = self.getThroughputByCommit(rightCommit, cfg)
curRel = rightThroughput / leftThroughput
isBad = not ((1 - curRel) < self.apprDev)
if isBad:
self.perfRel = curRel
rightCommit = rightCommit.replace('"', "")
commitLogger = getCommitLogger(cfg, rightCommit)
commitLogger.info("Performance relation is {rel}".format(rel=curRel))
commitLogger.info(
"Commit is {status}".format(status=("bad" if isBad else "good"))
)
return isBad
def getThroughputByCommit(self, commit, cfg):
commit = commit.replace('"', "")
curThroughput = 0
commitLogger = getCommitLogger(cfg, commit)
isCommitCashed, cashedThroughput = self.getCommitIfCashed(commit)
if isCommitCashed:
logMsg = "Cashed commit - {commit}".format(commit=commit)
self.commonLogger.info(logMsg)
commitLogger.info(logMsg)
curThroughput = cashedThroughput
else:
self.commonLogger.info("New commit: {commit}".format(
commit=commit)
)
handleCommit(commit, cfg)
output = fetchAppOutput(cfg)
foundThroughput = re.search(
self.outPattern, output, flags=re.MULTILINE
).group(1)
curThroughput = float(foundThroughput)
commitLogger.info(output)
self.setCommitCash(commit, curThroughput)
return curThroughput
def setOutputInfo(self, pathCommit):
pathCommit.perfRel = self.perfRel
def getResult(self):
for pathCommit in self.commitPath.getList():
print("Break commit: {c}, perf. ratio = {d}".format(
c=self.commitList[pathCommit.id],
d=pathCommit.perfRel)
)


class CompareBlobsMode(Mode):
    def __init__(self, cfg):
        super().__init__(cfg)
        self.createCash()
        self.maxDiff = 0

    def getOutNameByCommit(self, commit, cfg):
        commit = commit.replace('"', "")
        commitLogger = getCommitLogger(cfg, commit)
        filename = ''
        isCommitCashed, cachedfileName = self.getCommitIfCashed(commit)
        if isCommitCashed:
            logMsg = "Cached commit - {commit}".format(commit=commit)
            self.commonLogger.info(logMsg)
            commitLogger.info(logMsg)
            filename = cachedfileName
        else:
            self.commonLogger.info("New commit: {commit}".format(
                commit=commit)
            )
            runCommandList(commit, cfg, enforceClean=True)
            output = fetchAppOutput(cfg)
            commitLogger.info(output)
            filename = self.setCommitCash(commit, None)
        return filename

    def checkIfBordersDiffer(self, i1, i2, list, cfg):
        leftBorderOutputName = self.getOutNameByCommit(list[i1], cfg)
        rightBorderOutputName = self.getOutNameByCommit(list[i2], cfg)
        fullLeftFileName = os.path.join(self.cachePath, leftBorderOutputName)
        fullRightName = os.path.join(self.cachePath, rightBorderOutputName)
        curMaxDiff = getBlobDiff(fullLeftFileName, fullRightName)
        isDiff = curMaxDiff > self.limit
        rightCommit = list[i2]
        rightCommit = rightCommit.replace('"', "")
        commitLogger = getCommitLogger(cfg, rightCommit)
        commitLogger.info(
            "Commit {status} from {c}".format(
                status=("differs" if isDiff else "doesn't differ"),
                c=list[i2])
        )
        if isDiff:
            self.maxDiff = curMaxDiff
        commitLogger.info("Absolute difference is {d}".format(d=curMaxDiff))
        return isDiff

    def checkCfg(self, cfg):
        super().checkCfg(cfg)
        if "outputFileNamePattern" not in cfg["runConfig"]:
            raise CfgError("Output pattern is not configured")
        elif "outputDirectory" not in cfg["runConfig"]:
            raise CfgError("Output directory is not configured")
        else:
            self.outFileNamePattern = cfg["runConfig"]["outputFileNamePattern"]
            self.outDir = os.path.abspath(cfg["runConfig"]["outputDirectory"])
        if "limit" in cfg["runConfig"]:
            self.limit = float(cfg["runConfig"]["limit"])
        else:
            self.limit = 0

    def setCommitCash(self, commit, valueToCache):
        isCommitCashed, _ = self.getCommitIfCashed(commit)
        newFileName = ""
        if isCommitCashed:
            raise CashError("Commit already cached")
        else:
            fileList = os.listdir(self.outDir)
            # we look for just created output file
            for filename in fileList:
                isDump = re.search(self.outFileNamePattern, filename)
                if isDump:
                    newFileName = "{c}_{fn}".format(
                        c=getMeaningfullCommitTail(commit), fn=filename
                    )
                    shutil.move(
                        os.path.join(self.outDir, filename),
                        os.path.join(self.cachePath, newFileName)
                    )
                    break
            # newFileName stays empty if no file matched the pattern
            if newFileName == "":
                raise CmdError("Output file not found")
        return newFileName

    def createCash(self):
        # we use separate files instead of json cache,
        # so, we just set up path to cache folder
        self.cachePath = getActualPath("cachePath", self.cfg)

    def getCommitIfCashed(self, commit):
        fileList = os.listdir(self.cachePath)
        curCommitPattern = "{c}_(.)*".format(c=getMeaningfullCommitTail(commit))
        for filename in fileList:
            isDump = re.search(curCommitPattern, filename)
            if isDump:
                return True, filename
        return False, None

    def setOutputInfo(self, pathCommit):
        pathCommit.diff = self.maxDiff

    def getResult(self):
        for pathcommit in self.commitPath.getList():
            print("Break commit: {c}, diff = {d}".format(
                c=self.commitList[pathcommit.id],
                d=pathcommit.diff)
            )
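
The criterion used by `BenchmarkAppPerformanceMode` can be tried outside the tool with a small standalone sketch. It reuses the throughput pattern and the relative-drop check from the class above; the function names `parse_throughput` and `is_regression`, the sample output lines, and the 5% deviation are assumptions made for illustration and are not part of the commit slider.

```python
import re

# Same pattern as outPattern in BenchmarkAppPerformanceMode
THROUGHPUT_PATTERN = r"Throughput:\s*([0-9]*[.][0-9]*)\s*FPS"


def parse_throughput(output):
    """Return the FPS value found in the output, or None if it is absent."""
    found = re.search(THROUGHPUT_PATTERN, output, flags=re.MULTILINE)
    return float(found.group(1)) if found else None


def is_regression(left_fps, right_fps, allowed_deviation):
    """Mirror the isBad check: the relative drop reaches the deviation."""
    ratio = right_fps / left_fps
    return (1 - ratio) >= allowed_deviation


# Hypothetical output lines, made up for this example
left = parse_throughput("[ INFO ] Throughput:   100.00 FPS")
right = parse_throughput("[ INFO ] Throughput:   90.00 FPS")
print(is_regression(left, right, 0.05))
```

With these made-up numbers the right commit runs at 90% of the left one, so a 5% allowed deviation marks it as a regression, while a 2% drop would pass.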

View File

@ -0,0 +1,12 @@
import re
import fileinput


def replace(cfg):
    prepCfg = cfg["runConfig"]["preprocess"]
    filePath = prepCfg["file"]
    pattern = prepCfg["pattern"]
    replacement = ''
    # with inplace=True, print() writes to the file instead of stdout
    for line in fileinput.input(filePath, inplace=True):
        newLine = re.sub(pattern, replacement, line)
        print(newLine, end='')
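
The in-place rewrite done by `replace` relies on `fileinput` redirecting standard output into the file being read. A minimal sketch of that behaviour on a temporary file follows; the helper name `strip_pattern`, the file name, and the file content are hypothetical and serve only as an illustration.

```python
import fileinput
import os
import re
import tempfile


def strip_pattern(file_path, pattern):
    """Remove every match of pattern from the file, line by line."""
    # with inplace=True, stdout is redirected into the file being read
    with fileinput.input(file_path, inplace=True) as lines:
        for line in lines:
            print(re.sub(pattern, "", line), end="")


# Hypothetical file content, made up for this example
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.txt")
    with open(path, "w") as f:
        f.write("keep DEBUG_ME this\nand DEBUG_ME that\n")
    strip_pattern(path, r"DEBUG_ME ")
    with open(path) as f:
        result = f.read()
print(result)
```

Because the replacement string is empty, the preprocessing step can only delete matched text; substituting other text would need an extra config value, which the code above does not define.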