Skip to content
This repository was archived by the owner on Mar 26, 2026. It is now read-only.

feat: add summary page for Classes and Modules#361

Merged
dandhlee merged 4 commits intomainfrom
add_summary_page
Mar 29, 2024
Merged

feat: add summary page for Classes and Modules#361
dandhlee merged 4 commits intomainfrom
add_summary_page

Conversation

@dandhlee
Copy link
Copy Markdown
Collaborator

Curate summary pages given the entirety of the library content. Starting with Classes / Modules which are similar, then will expand to other types (Methods & Functions, and Properties & Attributes) in a followup.

Goes through all of the content, then extracts the necessary information, creating a separate summary_class.yml file. Adding as a separate top level entry. See live example: https://cloud.google.com/python/docs/reference/bigframes/latest/summary_class

Confirmed locally that the same page still gets produced, which was based off of #298, this PR just makes the code a bit more production quality.

summary_overview page is referenced but will have templates that live in the client libraries. See https://docs.google.com/document/d/1FmAOv9ald2W2Set8jPzN-gwQoE992LCWSu40KBVON28/edit?resourcekey=0-JLhux0549oJustMl46l3zA&tab=t.0 for design.

Test cases would be added for goldens, however it is currently disabled.

Towards b/263399076

  • Tests pass
  • Appropriate changes to README are included in PR

@product-auto-label product-auto-label Bot added the size: m Pull request size is medium. label Mar 27, 2024
@dandhlee dandhlee marked this pull request as ready for review March 27, 2024 08:47
@dandhlee dandhlee requested review from a team and dansaadati March 27, 2024 08:47
Comment thread docfx_yaml/extension.py Outdated

file_path_to_use = os.path.join(normalized_outdir, file_name)
with open(file_path_to_use, "w") as summary_file_obj:
summary_file_obj.write("### YamlMime:UniversalReference\n")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: set a constant for the docfx YAML type string

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread docfx_yaml/extension.py
CLASS: CLASS,
}
# Construct a mapping of name and content for each unique summary type entry.
_ENTRY_NAME_AND_ENTRY_CONTENT_BY_SUMMARY_TYPE = {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be converted to a dataclass? Something like:

class SummaryEntry:
  name: str
  summary: str

For now it seems like there's just one summary type CLASS, so I think this information can be captured in just one data class. This would allow us to not use i-indexing below, which I think makes this more readable.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit tricky, I need to keep track of all the entries that's been submitted already. Either we need to keep an extra list, otherwise converting this to a dataclass we'd lose being able to have a sequence of information, instead we'll have Sequence[SummaryEntry] and not be able to easily go through the info :/

Comment thread docfx_yaml/extension.py
DEPRECATED: 'deprecated',
}

_SUMMARY_TYPE_BY_ITEM_TYPE = {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Maybe use _SUMMARY_GROUP instead? The phrasing can get a little bit confusing because you're working with Python types as well.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a preferred style for mappings (summary_by_item[item] gives summary) - I could also just omit the types if that makes it better

Comment thread docfx_yaml/extension.py
uid = yaml_data.get("uid", "")
item_to_add = uid if summary_type == CLASS else f"{uid}-summary"

if item_to_add not in _ENTRY_NAME_AND_ENTRY_CONTENT_BY_SUMMARY_TYPE[summary_type][0]:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can avoid this check if your use a Set for your uids.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem was that I ran into the same items twice, not that the UIDs are not unique :/ I was seeing duplicate entries for each entry, but the UIDs are definitely unique (google.cloud.package.module.item)

Comment thread docfx_yaml/extension.py Outdated
continue

_ENTRY_NAME_AND_ENTRY_CONTENT_BY_SUMMARY_TYPE[summary_type][1].append(
_find_summary_details(entry, summary_type, cgc_url)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think because the return value of _find_summary_details is only used to update the global mapping, I would just append the summary detail dict within the function.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! Good point.

Comment thread docfx_yaml/extension.py Outdated
],
}
)
cgc_url = (
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Move this below writing the toc file.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread docfx_yaml/extension.py
'langs': ['python'],
'type': 'package',
'summary': f'Summary of entries of {entry_name} for {library_name}.',
'children': children_name_and_summary_content[0],
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: consider sorting by class/module name here.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's already sorted by UID!

Comment thread docfx_yaml/extension.py Outdated


def _render_summary_content(
children_name_and_summary_content: Sequence[Sequence[str]],
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be mistaken here, but isn't this type Sequence[Sequence[ str | dict ] or Sequence[Sequence[ str | _yaml_type_alias ] ? The first sequence contains the list of uids, the second sequence contains the dict of YAML fields for references.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Done.

Comment thread docfx_yaml/extension.py
'langs': ['python'],
'type': 'package',
'summary': f'Summary of entries of {entry_name} for {library_name}.',
'children': children_name_and_summary_content[0],
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this need to be converted into a YAML list?
Something like:

reduce(lambda x, y:  y + '\n - '  + x , children_name_and_summary_content[0])

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, that gets handled already!

Copy link
Copy Markdown
Collaborator Author

@dandhlee dandhlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified that the changes did not affect produced result. Please take a look again!

Comment thread docfx_yaml/extension.py Outdated

file_path_to_use = os.path.join(normalized_outdir, file_name)
with open(file_path_to_use, "w") as summary_file_obj:
summary_file_obj.write("### YamlMime:UniversalReference\n")
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread docfx_yaml/extension.py
'langs': ['python'],
'type': 'package',
'summary': f'Summary of entries of {entry_name} for {library_name}.',
'children': children_name_and_summary_content[0],
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's already sorted by UID!

Comment thread docfx_yaml/extension.py Outdated


def _render_summary_content(
children_name_and_summary_content: Sequence[Sequence[str]],
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Done.

Comment thread docfx_yaml/extension.py
uid = yaml_data.get("uid", "")
item_to_add = uid if summary_type == CLASS else f"{uid}-summary"

if item_to_add not in _ENTRY_NAME_AND_ENTRY_CONTENT_BY_SUMMARY_TYPE[summary_type][0]:
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the problem was that I ran into the same items twice, not that the UIDs are not unique :/ I was seeing duplicate entries for each entry, but the UIDs are definitely unique (google.cloud.package.module.item)

Comment thread docfx_yaml/extension.py
CLASS: CLASS,
}
# Construct a mapping of name and content for each unique summary type entry.
_ENTRY_NAME_AND_ENTRY_CONTENT_BY_SUMMARY_TYPE = {
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit tricky, I need to keep track of all the entries that's been submitted already. Either we need to keep an extra list, otherwise converting this to a dataclass we'd lose being able to have a sequence of information, instead we'll have Sequence[SummaryEntry] and not be able to easily go through the info :/

Comment thread docfx_yaml/extension.py
DEPRECATED: 'deprecated',
}

_SUMMARY_TYPE_BY_ITEM_TYPE = {
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a preferred style for mappings (summary_by_item[item] gives summary) - I could also just omit the types if that makes it better

Comment thread docfx_yaml/extension.py
'langs': ['python'],
'type': 'package',
'summary': f'Summary of entries of {entry_name} for {library_name}.',
'children': children_name_and_summary_content[0],
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, that gets handled already!

Comment thread docfx_yaml/extension.py Outdated
],
}
)
cgc_url = (
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment thread docfx_yaml/extension.py Outdated
continue

_ENTRY_NAME_AND_ENTRY_CONTENT_BY_SUMMARY_TYPE[summary_type][1].append(
_find_summary_details(entry, summary_type, cgc_url)
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! Good point.

@dandhlee dandhlee requested a review from dansaadati March 29, 2024 19:18
@dandhlee dandhlee merged commit e56c3a7 into main Mar 29, 2024
@dandhlee dandhlee deleted the add_summary_page branch March 29, 2024 20:05
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

size: m Pull request size is medium.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants