Description
Search before asking
- I have searched the issues and found no similar feature request.
Problem Description
Users generate a new contextId each time they publish, but the old contextIds are never cleared. As a result, a large amount of redundant data accumulates in the CS table and slows down queries.
Description
- Linkis already provides a batch cleaning method;
- A DSS scheduled (timer) task calls the batchClearContextId method and runs the cleanup every morning.
Use case
No response
Solutions
Sequence diagram for clearing DSS contextIds:

Description: The DSS scheduled task calls the batchClearContextId method and runs the cleanup every morning. (The schedule is configurable through the parameter wds.dss.server.scheduling.clear.cs.cron=0 0 3 * * ?)
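As a reading aid for the schedule value, here is a minimal sketch that splits the expression into named fields. It assumes the value is a six-field Quartz-style cron expression (`0 0 3 * * ?`, i.e. 03:00 every day); the helper name `describe_cron` and the field labels are illustrative and not part of DSS.

```python
# Field order of a six-field Quartz-style cron expression (assumed format).
QUARTZ_FIELDS = ("second", "minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Split a six-field cron expression into a dict of named fields."""
    parts = expression.split()
    if len(parts) != len(QUARTZ_FIELDS):
        raise ValueError(f"expected {len(QUARTZ_FIELDS)} fields, got {len(parts)}")
    return dict(zip(QUARTZ_FIELDS, parts))


fields = describe_cron("0 0 3 * * ?")
print(fields)
```

Note that the fields must be separated by whitespace: a value written as `0 0 3 * *?` has only five whitespace-separated fields and is rejected by this check.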
1.1. First, query the dss_orchestrator_version_info table for the versions that fall outside the retained range. The number of versions to retain is set by the configuration parameter:
wds.dss.publish.max.remain.version=30
SELECT
    b.orchestrator_id,
    b.context_id,
    b.app_id,
    b.id
FROM
    dss_release_task a
    LEFT JOIN dss_orchestrator_version_info b
        ON a.orchestrator_version_id = b.id
    LEFT JOIN dss_orchestrator_info c
        ON c.id = b.orchestrator_id
    LEFT JOIN (
        SELECT
            orchestrator_id,
            MAX(version) AS maxVersion
        FROM
            dss_orchestrator_version_info
        GROUP BY
            orchestrator_id
        HAVING
            SUBSTRING(MAX(version), 2) > #{remainVersion}
    ) tmp
        ON b.orchestrator_id = tmp.orchestrator_id
WHERE
    a.status = 'Success'
    AND SUBSTRING(b.version, 2) + #{remainVersion} < SUBSTRING(tmp.maxVersion, 2)
    AND b.context_id != ''
ORDER BY
    a.id DESC
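To make the retention rule in the WHERE clause easier to check, the following sketch applies the same comparison in memory. It is not the DSS implementation: it assumes version strings are a one-character prefix followed by a zero-padded number (for example `v000005`), uses a hypothetical `(orchestrator_id, version, context_id)` row shape, and leaves out the join on dss_release_task and the `status = 'Success'` filter.

```python
from collections import defaultdict


def version_number(version: str) -> int:
    """Mirror SUBSTRING(version, 2): drop the first character, read the rest as a number."""
    return int(version[1:])


def expired_versions(rows, remain_version):
    """Return rows that trail their orchestrator's newest version by more than remain_version.

    rows: list of (orchestrator_id, version, context_id) tuples (hypothetical shape).
    """
    # Newest version number per orchestrator, like the grouped subquery `tmp`.
    newest = defaultdict(int)
    for orchestrator_id, version, _ in rows:
        newest[orchestrator_id] = max(newest[orchestrator_id], version_number(version))

    # Same predicate as the WHERE clause: version + remain < newest, contextId not empty.
    return [
        (orchestrator_id, version, context_id)
        for orchestrator_id, version, context_id in rows
        if context_id != ""
        and version_number(version) + remain_version < newest[orchestrator_id]
    ]


rows = [
    (1, "v000001", "ctx-a"),
    (1, "v000002", ""),
    (1, "v000003", "ctx-c"),
    (1, "v000005", "ctx-e"),
    (2, "v000001", "ctx-x"),
]
expired = expired_versions(rows, remain_version=2)
print(expired)
```

With `remain_version=2` and a newest version of 5, only version 1 of orchestrator 1 qualifies: version 2 has an empty contextId, and version 3 fails the strict comparison (3 + 2 is not less than 5).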
1.2. The rows returned are obsolete historical versions. Delete this version data first, so that a later failure in the Linkis batch deletion call does not cause data errors.
1.3. Send an RPC request to the workflow service to query the contextIds of the sub-workflows.
1.5.1. First, determine whether the workflow has any sub-workflows. If it has none, return an empty set; otherwise, return the sub-workflow IDs.
1.5.2. If sub-workflows exist, obtain the workflow JSON through the getFlow method, parse it to extract the sub-workflow contextId, and return it.
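The sub-workflow lookup described in these two steps could be sketched roughly as follows. Everything here is an assumption made for illustration: the `get_flow` callable stands in for the getFlow method (whose real signature is not shown in this issue), and the `contextID` key is a hypothetical name for wherever the flow JSON stores its contextId.

```python
import json


def collect_sub_flow_context_ids(sub_flow_ids, get_flow):
    """Return the contextIds of the given sub-workflows.

    sub_flow_ids: IDs found in step 1.5.1; an empty list yields an empty result.
    get_flow: callable mapping a flow ID to that flow's JSON string (stand-in for getFlow).
    """
    context_ids = []
    for flow_id in sub_flow_ids:
        flow = json.loads(get_flow(flow_id))
        context_id = flow.get("contextID")  # hypothetical key name
        if context_id:
            context_ids.append(context_id)
    return context_ids


# In-memory stand-in for the workflow service, for demonstration only.
fake_flows = {
    11: json.dumps({"contextID": "ctx-sub-11", "nodes": []}),
    12: json.dumps({"nodes": []}),  # a flow without a contextId is skipped
}
found = collect_sub_flow_context_ids([11, 12], fake_flows.__getitem__)
print(found)
```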
1.6. After collecting all the contextIds to be cleaned, call the Linkis batch cleaning method batchClearContextByHAID to clean them in batches of 1000.
1.7. After cleaning the contextIds, clean up the old BML files.
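The batching in step 1.6 can be illustrated with a small sketch. The `clear_batch` callable is a placeholder for the call to batchClearContextByHAID, whose actual signature is not given in this issue; only the batch size of 1000 comes from the description above.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def clear_in_batches(context_ids, clear_batch, batch_size=1000):
    """Invoke clear_batch once per chunk and return the number of calls made."""
    calls = 0
    for batch in chunked(context_ids, batch_size):
        clear_batch(batch)  # placeholder for the Linkis batch cleaning call
        calls += 1
    return calls


ids = [f"ctx-{i}" for i in range(2500)]
batch_sizes = []
call_count = clear_in_batches(ids, lambda batch: batch_sizes.append(len(batch)))
print(call_count, batch_sizes)
```

Bounding each call to a fixed number of contextIds keeps any single request small, at the cost of more round trips when many versions have expired.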
Anything else
No response
Are you willing to submit a PR?
- Yes I am willing to submit a PR!