HDDS-11072. Publish user-facing configs to the doc site #6916

Draft: wants to merge 17 commits into base: master

Build - xmltomd - push the changes to ozone-site
sarvekshayr committed Aug 14, 2024
commit 180adc2b5c7ac26796af47f81df7cf0bf97c0e91
112 changes: 101 additions & 11 deletions .github/workflows/doc.yml
@@ -17,15 +17,85 @@ name: xml-to-md
on:
push:
branches:
- HDDS-11072
- master

jobs:
build:
runs-on: ubuntu-20.04
timeout-minutes: 60
strategy:
matrix:
java: [ 8 ]
fail-fast: false
steps:
- name: Checkout project
uses: actions/checkout@v4

- name: Cache for npm dependencies
uses: actions/cache@v4
with:
path: |
~/.pnpm-store
**/node_modules
key: ${{ runner.os }}-pnpm-${{ hashFiles('**/pnpm-lock.yaml') }}
restore-keys: |
${{ runner.os }}-pnpm-

- name: Cache for maven dependencies
uses: actions/cache/restore@v4
with:
path: |
~/.m2/repository/*/*/*
!~/.m2/repository/org/apache/ozone
key: maven-repo-${{ hashFiles('**/pom.xml') }}
restore-keys: |
maven-repo-
- name: Download Ratis repo
if: ${{ inputs.ratis_args != '' }}
uses: actions/download-artifact@v4
with:
name: ratis-jars
path: |
~/.m2/repository/org/apache/ratis
- name: Setup java
uses: actions/setup-java@v4
with:
distribution: 'temurin'
java-version: ${{ matrix.java }}
- name: Run a full build
run: hadoop-ozone/dev-support/checks/build.sh -Pdist -Psrc -Dmaven.javadoc.skip=true ${{ inputs.ratis_args }}
env:
DEVELOCITY_ACCESS_KEY: ${{ secrets.GE_ACCESS_TOKEN }}
- name: Store binaries for tests
uses: actions/upload-artifact@v4
with:
name: ozone-bin
path: |
hadoop-ozone/dist/target/ozone-*.tar.gz
!hadoop-ozone/dist/target/ozone-*-src.tar.gz
retention-days: 1
- name: Store source tarball for compilation
uses: actions/upload-artifact@v4
with:
name: ozone-src
path: hadoop-ozone/dist/target/ozone-*-src.tar.gz
retention-days: 1
- name: Store Maven repo for tests
uses: actions/upload-artifact@v4
with:
name: ozone-repo
path: |
~/.m2/repository/org/apache/ozone
retention-days: 1

xml-to-md:
needs:
- build
runs-on: ubuntu-20.04

steps:
- name: Checkout repository
uses: actions/checkout@v3
uses: actions/checkout@v4

- name: Set up Python
uses: actions/setup-python@v4
@@ -35,16 +105,36 @@ jobs:
- name: Install dependencies
run: sudo apt-get update && sudo apt-get install -y python3-pip

- name: Download the source artifact
uses: actions/download-artifact@v4
with:
name: ozone-bin
path: ozone-bin

- name: Extract the source tarball
run: |
mkdir -p ozone-bin/extracted
tar -xzf ozone-bin/ozone-*-SNAPSHOT.tar.gz -C ozone-bin/extracted

- name: Run the Python script to convert XML properties into Markdown
run: python3 dev_support/ci/xml_to_md.py
run: python3 dev-support/ci/xml_to_md.py

- name: Upload the markdown file
uses: actions/upload-artifact@v3
- name: Push changes to ozone-site
uses: actions/checkout@v4
with:
name: Configurations.md
path: hadoop-hdds/docs/content/tools/Configurations.md
repository: apache/ozone-site
ref: 'HDDS-9225-website-v2'
token: ${{ secrets.GITHUB_TOKEN }}
path: ozone-site

Contributor:

A similar hash check step needs to be added here to detect whether the config file is outdated, i.e. whether a new config was added in some commit.
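
One way the suggested hash check could look, as a minimal sketch rather than the step the reviewer has in mind: compare a SHA-256 digest of the freshly generated Markdown with the copy already in the ozone-site checkout, and only copy and commit when they differ. The helper names `file_sha256` and `is_outdated` are hypothetical, and temporary files stand in for the two Markdown paths used in the workflow.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, or '' if the file does not exist."""
    if not path.is_file():
        return ""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def is_outdated(generated: Path, published: Path) -> bool:
    """True when the published copy is missing or differs from the generated one."""
    return file_sha256(generated) != file_sha256(published)


# Demonstration: temporary files stand in for the generated and published copies.
with tempfile.TemporaryDirectory() as tmp:
    generated = Path(tmp) / "Configurations.md"
    published = Path(tmp) / "06-configurations.md"

    generated.write_text("| ozone.a | 1 |\n")
    published.write_text("| ozone.a | 1 |\n")
    same = is_outdated(generated, published)      # identical content

    generated.write_text("| ozone.a | 1 |\n| ozone.b | 2 |\n")
    changed = is_outdated(generated, published)   # a new config was added
```

In the workflow, the copy and commit steps would then run only when `is_outdated` reports a difference, which also avoids a failing `git commit` when nothing changed.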

- name: Download the markdown file
uses: actions/download-artifact@v3
with:
name: Configurations.md
- name: Copy MD file
run: |
cp hadoop-hdds/docs/content/tools/Configurations.md ozone-site/docs/03-core-concepts/06-configurations.md

- name: Commit changes
run: |
git config --global user.name 'Github Actions'
git config --global user.email '[email protected]'
git add ozone-site/docs/03-core-concepts/06-configurations.md
git commit -m "[Auto] Push MD file to Ozone site"
git push
71 changes: 32 additions & 39 deletions dev-support/ci/xml_to_md.py
@@ -1,38 +1,39 @@
import os
import re
import zipfile
import xml.etree.ElementTree as ET
from collections import namedtuple
from pathlib import Path

# Define the Property namedtuple to hold property details
Property = namedtuple('Property', ['name', 'value', 'tag', 'description'])

def parse_xml_file(file_path):
"""
Parse the given XML file and extract properties.
def extract_xml_from_jar(jar_path, xml_filename):
xml_files = []
with zipfile.ZipFile(jar_path, 'r') as jar:
for file_info in jar.infolist():
if file_info.filename.endswith(xml_filename):
with jar.open(file_info.filename) as xml_file:
xml_files.append(xml_file.read())
return xml_files

:param file_path: Path to the XML file
:return: Dictionary of properties with property names as keys
"""
tree = ET.parse(file_path)
root = tree.getroot()
def parse_xml_file(xml_content):
root = ET.fromstring(xml_content)
properties = {}
for prop in root.findall('property'):
name = prop.find('name').text if prop.find('name') is not None else ''
value = prop.find('value').text if prop.find('value') is not None else ''
tag = prop.find('tag').text if prop.find('tag') is not None else ''
description = prop.find('description').text if prop.find('description') is not None else ''
description = ' '.join(description.split()).strip() # Clean up whitespace
description = ' '.join(description.split()).strip() if description else ''
properties[name] = Property(name, value, tag, description)
return properties

def generate_markdown(properties):
"""
Generate Markdown content from properties.

:param properties: Dictionary of properties
:return: Markdown string
"""
markdown = []
markdown.append("---\n")
markdown.append("title: \"Ozone configurations\"\n")
markdown.append("summary: Ozone configurations\n")
Contributor:

The current website doesn't require front matter. The new website will fail the build if disallowed front matter keys such as title and summary are used. I would change this to use sidebar_label, and only if the label in the docs sidebar should differ from the page title given by the first heading. sidebar_label is only used by the new website; the current website will ignore this key.
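
A possible shape for that change, sketched under assumptions: the page title comes from the first heading, and front matter is written only when a different sidebar label is wanted. `build_header` is a hypothetical helper, not part of the PR, and the sketch leaves out the license comment that the script also writes.

```python
from typing import Optional


def build_header(page_title: str, sidebar_label: Optional[str] = None) -> str:
    """Build the top of the Markdown page.

    Front matter is emitted only when a sidebar label is given and it
    differs from the page title; otherwise the heading alone is used.
    """
    lines = []
    if sidebar_label and sidebar_label != page_title:
        # sidebar_label is the only front matter key written here.
        lines += ["---", f"sidebar_label: {sidebar_label}", "---", ""]
    lines += [f"# {page_title}", ""]
    return "\n".join(lines)


plain = build_header("Ozone configurations")
labelled = build_header("Ozone configurations", sidebar_label="Configurations")
```

The label is written unquoted, so a label containing YAML-special characters such as a colon would need quoting.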

markdown.append("---\n")
markdown.append("<!--\n")
markdown.append("Licensed to the Apache Software Foundation (ASF) under one or more\n")
markdown.append("contributor license agreements. See the NOTICE file distributed with\n")
@@ -59,34 +60,26 @@ def generate_markdown(properties):
return ''.join(markdown)

def main():
"""
Main function to parse XML files and generate Markdown documentation.
"""
xml_files = [
"hadoop-hdds/client/target/classes/ozone-default-generated.xml",
"hadoop-hdds/common/target/classes/ozone-default-generated.xml",
"hadoop-hdds/container-service/target/classes/ozone-default-generated.xml",
"hadoop-hdds/framework/target/classes/ozone-default-generated.xml",
"hadoop-hdds/server-scm/target/classes/ozone-default-generated.xml",
"hadoop-ozone/common/target/classes/ozone-default-generated.xml",
"hadoop-ozone/csi/target/classes/ozone-default-generated.xml",
"hadoop-ozone/ozone-manager/target/classes/ozone-default-generated.xml",
"hadoop-ozone/recon-codegen/target/classes/ozone-default-generated.xml",
"hadoop-ozone/recon/target/classes/ozone-default-generated.xml",
]
base_path = 'ozone-bin/extracted'

property_map = {}
# Find ozone SNAPSHOT directory dynamically using regex
snapshot_dir = next(
(os.path.join(base_path, d) for d in os.listdir(base_path) if re.match(r'ozone-.*-SNAPSHOT', d)),
None
)

for xml_file in xml_files:
if not os.path.exists(xml_file):
print(f"File not found: {xml_file}")
continue
extract_path = os.path.join(snapshot_dir, 'share', 'ozone', 'lib')
xml_filename = 'ozone-default-generated.xml'

properties = parse_xml_file(xml_file)
property_map.update(properties)
property_map = {}
for file_name in os.listdir(extract_path):
if file_name.endswith('.jar'):
jar_path = os.path.join(extract_path, file_name)
xml_contents = extract_xml_from_jar(jar_path, xml_filename)
for xml_content in xml_contents:
property_map.update(parse_xml_file(xml_content))

markdown_content = generate_markdown(property_map)

output_path = Path("hadoop-hdds/docs/content/tools/Configurations.md")
output_path.parent.mkdir(parents=True, exist_ok=True)

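
The jar-scanning approach in this diff can be tried without a full build. The following is a minimal sketch, not the PR's code: it builds two small in-memory jars, reads every `ozone-default-generated.xml` entry, and merges the properties. `read_properties` and `make_jar` are hypothetical helpers, and the property names are made up for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

XML_NAME = "ozone-default-generated.xml"


def read_properties(jar_bytes: bytes) -> dict:
    """Map property name -> default value for every matching XML entry in one jar."""
    props = {}
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for name in jar.namelist():
            if not name.endswith(XML_NAME):
                continue
            root = ET.fromstring(jar.read(name))
            for prop in root.findall("property"):
                # findtext falls back to the default when the child element is absent.
                key = prop.findtext("name", default="")
                props[key] = prop.findtext("value", default="") or ""
    return props


def make_jar(xml_text: str) -> bytes:
    """Build a tiny in-memory jar holding one XML file (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        jar.writestr(XML_NAME, xml_text)
    return buf.getvalue()


jar_a = make_jar(
    "<configuration><property><name>ozone.a</name><value>1</value></property></configuration>"
)
jar_b = make_jar(
    "<configuration><property><name>ozone.a</name><value>2</value></property>"
    "<property><name>ozone.b</name></property></configuration>"
)

merged = {}
for jar_bytes in (jar_a, jar_b):
    merged.update(read_properties(jar_bytes))  # later jars overwrite earlier ones
```

Because `dict.update` lets the last jar win for a duplicated property name, and `os.listdir` returns entries in arbitrary order, sorting the jar names before the loop would make the generated page deterministic if the same property ever appears in more than one jar.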