Re-add curly braces in author names after latex parsing #293
…dlingInAuthor * upstream/master: Add missing item types and variables from the CSL 1.0.2 specification
|
@michel-kraemer I finally found time to fix the failing tests and to exclude inner formatting braces. |
|
Thank you very much for raising this issue and for providing the pull request! Also, apologies for my late reply. As I said, I was out of office and didn't have access to a computer. The PR works very well! I reviewed it carefully and have one final question before I merge it. If I understand it correctly, this is a workaround for a bug in JBibTex, right? I was wondering because I discovered that the following test fails:

@Test
public void curlyBracesAreReaddedOnlyForSecondAuthor() throws ParseException {
    String entry = "@online{testcitationkey,\n" +
        " author = {Foo Bar and {Foo Bar}},\n" +
        " journal = {Test journal},\n" +
        " title = {Test title},\n" +
        " year = {2025},\n" +
        "}";
    BibTeXDatabase db = new BibTeXParser().parse(new StringReader(entry));
    BibTeXConverter converter = new BibTeXConverter();
    Map<String, CSLItemData> items = converter.toItemData(db);
    CSLItemData item = items.get("testcitationkey");
    CSLName name1 = new CSLNameBuilder()
        .family("Bar")
        .given("Foo")
        .build();
    CSLName name2 = new CSLNameBuilder()
        .literal("Foo Bar")
        .build();
    assertEquals(name1, item.getAuthor()[0]); // <- FAILS HERE
    assertEquals(name2, item.getAuthor()[1]);
}

The code that re-adds the curly braces doesn't distinguish between the first and the second occurrence of "Foo Bar", so both will be wrapped. I don't think this is a huge problem, as the test case I constructed is artificial and most likely doesn't happen in practice. I was just wondering if it wouldn't be better to fix the actual bug in JBibTex (or maybe add a flag to change its behavior). Since you have already put more thought into this than I did, I wanted to ask what your opinion is. |
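For readers following along, the ambiguity described above can be reproduced without JBibTex at all. The snippet below is only a minimal sketch and not the code from this pull request: `NaiveBraceReAdd` and `reAddBraces` are hypothetical names, and it assumes the braces are restored by plain string matching on the already-parsed field.

```java
// Hypothetical illustration; not the code from this pull request.
class NaiveBraceReAdd {
    // Wraps every occurrence of `grouped` in braces. A plain string search
    // cannot tell which occurrence was braced in the BibTeX source.
    static String reAddBraces(String parsedAuthors, String grouped) {
        return parsedAuthors.replace(grouped, "{" + grouped + "}");
    }

    public static void main(String[] args) {
        // The source field was: Foo Bar and {Foo Bar}
        // After LaTeX parsing, the braces are gone:
        String parsed = "Foo Bar and Foo Bar";
        System.out.println(reAddBraces(parsed, "Foo Bar"));
        // prints: {Foo Bar} and {Foo Bar}
    }
}
```

Both names end up wrapped, which is why the first author is no longer split into family and given name in the failing test.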
|
Ah! I think I get it now. I just added a print statement:

List<LaTeXObject> objs = latexParser.parse(new StringReader(us));
System.out.println(objs);
us = latexPrinter.print(objs).replaceAll("\\n", " ").replaceAll("\\r", "").trim();

And got the following output: What if we looked for |
|
The following seems to work with all your test cases and mine:

for (Map.Entry<Key, Value> field : e.getFields().entrySet()) {
    String us = field.getValue().toUserString().replaceAll("\\r", "");
    // convert LaTeX string to normal text
    try {
        List<LaTeXObject> objs = latexParser.parse(new StringReader(us));
        List<LaTeXObject> newObjs;
        String keyLower = field.getKey().getValue().toLowerCase();
        if (FIELD_AUTHOR.equals(keyLower) || FIELD_EDITOR.equals(keyLower)) {
            newObjs = new ArrayList<>();
            for (LaTeXObject o : objs) {
                if (o instanceof LaTeXGroup) {
                    List<LaTeXObject> children = new ArrayList<>();
                    children.add(new LaTeXString("{"));
                    children.addAll(((LaTeXGroup)o).getObjects());
                    children.add(new LaTeXString("}"));
                    LaTeXGroup g = new LaTeXGroup(children);
                    newObjs.add(g);
                } else {
                    newObjs.add(o);
                }
            }
        } else {
            newObjs = objs;
        }
        us = latexPrinter.print(newObjs).replaceAll("\\n", " ").replaceAll("\\r", "").trim();
    } catch (ParseException | TokenMgrException ex) {
        // ignore
    }
    entries.put(field.getKey().getValue().toLowerCase(), us);
} |
|
@michel-kraemer Thanks for the update. Your code looks even simpler and, I agree, more future-proof. Actually, this is not really a bug in JBibtex. The behavior is correct for a LaTeX parser: the LaTeXParser creates two LaTeX objects out of it. Feel free to push your changes to this branch. |
|
@michel-kraemer I tested your approach now, and it fails in the fixture tests (I had this initially as well) for diacritics in curly braces. I came up with a better solution: check for LaTeX commands inside the LaTeX objects. |
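One possible reading of "check for LaTeX commands inside the LaTeX objects" is sketched below. This is not the code that was pushed to the branch. The types `Text`, `Command`, and `Group` are hypothetical stand-ins for the JBibTex classes so that the example runs on its own, and the rule "skip any group that contains a command" is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in types for illustration only; the real classes live in org.jbibtex.
class BraceSketch {
    interface Node {}

    static final class Text implements Node {
        final String value;
        Text(String value) { this.value = value; }
    }

    static final class Command implements Node {
        final String name;
        Command(String name) { this.name = name; }
    }

    static final class Group implements Node {
        final List<Node> children;
        Group(List<Node> children) { this.children = children; }
    }

    // True if the group, or any group nested inside it, contains a command.
    static boolean containsCommand(Group g) {
        for (Node n : g.children) {
            if (n instanceof Command) return true;
            if (n instanceof Group && containsCommand((Group) n)) return true;
        }
        return false;
    }

    // Re-adds literal braces around top-level groups, except for groups that
    // hold a command (assumed here to be formatting such as a diacritic).
    static List<Node> reAddBraces(List<Node> nodes) {
        List<Node> result = new ArrayList<>();
        for (Node n : nodes) {
            if (n instanceof Group && !containsCommand((Group) n)) {
                List<Node> children = new ArrayList<>();
                children.add(new Text("{"));
                children.addAll(((Group) n).children);
                children.add(new Text("}"));
                result.add(new Group(children));
            } else {
                result.add(n);
            }
        }
        return result;
    }

    // Simplified printer: groups contribute no braces of their own, and
    // commands are echoed as backslash plus name instead of being converted.
    static String print(List<Node> nodes) {
        StringBuilder sb = new StringBuilder();
        for (Node n : nodes) {
            if (n instanceof Text) sb.append(((Text) n).value);
            else if (n instanceof Command) sb.append('\\').append(((Command) n).name);
            else if (n instanceof Group) sb.append(print(((Group) n).children));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Node> names = Arrays.asList(
            new Text("Foo Bar and "),
            new Group(Arrays.asList(new Text("Foo Bar"))));
        System.out.println(print(reAddBraces(names)));
    }
}
```

In this model, a braced literal name keeps its braces, while a group that only carries a diacritic command is passed through unchanged and left to the printer.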
|
@Siedlerchr Thanks for your effort! 👍 |
Fixes #292
The reason is that the LaTeXParser in jbibtex uses curly braces to indicate groups in its grammar. It neither treats a braced group as a single item nor preserves the braces.