Implement tidy csv output #1768
Their scope is more than just commodities.
This puts every date in a separate row, which is more suitable for many graphing programs.
This is currently accessed using the option
Text mode
Wide output
Tall output
Bare output
Csv mode
Wide output
Bare output
Tidy output
Layout options
As implemented, there are four arguments that
Alternatives
Open questions
Great examples. I think all of this is in the context of the balance command only? I think we need to be careful not to let --output-format and --layout overlap and become too complex. I think we would be wise to consolidate and build out the options we currently provide before adding more. I believe until now, only register and print reports show time on the y axis (i.e., going down), while balance reports always show accounts on the y axis and time on the x axis (going across). The example of tidy output seems to be a hybrid, with both accounts and time on the y axis. Why is there no tidy layout in text output, and no tall layout in CSV output? I assume they were considered not useful; still, consistency could help reduce confusion. It would be useful to see examples of these as well.
Yes, correct. We should probably think about whether they make sense for register and aregister too. I think they don't make sense for print.
I agree, but it's a tricky balancing act. I definitely want others' opinions on the correct interface here. I put together this proposal so we would have something concrete to work off of.
Unless you call with
Yes, a careful consideration of what to implement here is warranted. Here was my reasoning for the above.
Perhaps I should say that I think option 2 above (make tidy output accessible with |
I wonder why we even need Remind me again, why |
Many visualisation libraries (e.g. ggplot, Vega) require tidy output for
plotting. Without it implemented in hledger, the output would need to be
transformed to tidy format with an external scripting language, e.g. pandas
in Python or dplyr in R.
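To make the transformation mentioned above concrete, here is a minimal sketch of reshaping a wide report into tidy rows using only the Python standard library. The sample data and the column names (`account`, `period`, `value`) are hypothetical and are not taken from hledger's actual output.

```python
import csv
import io

# Hypothetical wide-layout report: one row per account, one column per period.
WIDE_CSV = """account,2022-01,2022-02
assets:bank,100,150
expenses:food,20,35
"""

def wide_to_tidy(wide_text, id_column="account"):
    """Turn every (account, period) cell of a wide table into its own row."""
    reader = csv.DictReader(io.StringIO(wide_text))
    tidy_rows = []
    for row in reader:
        for column, cell in row.items():
            if column == id_column:
                continue  # the identifier is repeated on each tidy row instead
            tidy_rows.append(
                {"account": row[id_column], "period": column, "value": cell}
            )
    return tidy_rows

tidy = wide_to_tidy(WIDE_CSV)
for r in tidy:
    print(r["account"], r["period"], r["value"])
```

Each output row now carries a single value together with the variables that identify it, which is the shape plotting libraries of this kind typically expect.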
Tall format makes a lot of sense for my use case. My accounts are primarily
in one or two commodities, but I have a large number of other commodities
that appear once or twice due to trips. If they stick around, they make the
text reports a sea of zeros which is hard to read.
Thanks for the "tidy" link:
Ok, great.
--commodity-column with --layout=bare in tests.
Merged and documented at https://hledger.org/dev/hledger.html#data-layout. This could arguably be either an output format or a --layout mode; we chose the latter for now, making --layout a more general "data layout" option. Thanks!
* hledger 1.28 2022-12-01
Features
- The `accounts` command has new flags: `--undeclared` (show accounts used but not declared), `--unused` (show accounts declared but not used), and `--find` (find the first account matched by the first command argument, a convenience for scripts). Also `-u` and `-d` short flags have been added for `--used` and `--declared`.
- A new CSV rule `intra-day-reversed` helps generate transactions in correct order with CSVs where records are reversed within each day.
- CSV rules can now correctly convert CSV date-times with an implicit or explicit timezone to dates in your local timezone. Previously, CSV date-times with a different time zone from yours could convert to off-by-one
dates, because the CSV's timezone was ignored.
Now,
1. When a CSV has date-times with an implicit timezone different from yours, you can use the `timezone` rule to declare it.
2. CSV date-times with a known timezone (either declared by `timezone` or parsed with `%Z`) will be localised to the system timezone
(or to the timezone set with the `TZ` environment variable).
(#1936)
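The off-by-one effect described in the timezone item above can be reproduced with a small sketch. This is an illustration in Python, not hledger's implementation; the two fixed UTC offsets and the sample timestamp are assumptions chosen to make the date shift visible.

```python
from datetime import datetime, timedelta, timezone

FORMAT = "%Y-%m-%d %H:%M"

def localised_date(text, source_tz, local_tz):
    """Parse a date-time recorded in source_tz and return the date in local_tz."""
    naive = datetime.strptime(text, FORMAT)
    aware = naive.replace(tzinfo=source_tz)  # declare the CSV's timezone
    return aware.astimezone(local_tz).date()

# Assumed offsets for illustration: the CSV is at UTC-8, the user at UTC+1.
csv_tz = timezone(timedelta(hours=-8))
local_tz = timezone(timedelta(hours=1))

stamp = "2022-12-01 23:30"
naive_date = datetime.strptime(stamp, FORMAT).date()  # CSV timezone ignored
local_date = localised_date(stamp, csv_tz, local_tz)  # CSV timezone respected

print(naive_date.isoformat(), local_date.isoformat())
```

Ignoring the source timezone keeps the date printed in the CSV, while localising moves a late-evening record onto the next day for a user nine hours ahead.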
Improvements
- print --match now respects -o and -O.
- print --match now returns a non-zero exit code when there is no acceptable match.
- Support megaparsec 9.3. (Felix Yan)
- Support GHC 9.4.
Fixes
- In CSV rules, when assigning a parenthesised account name to `accountN`, extra whitespace is now ignored, allowing unbalanced postings to be detected correctly.
Scripts/addons
- bin/hledger-move helps record transfers involving subaccounts and costs,
eg when withdrawing some or all of an investment balance containing many lots and costs.
- bin/hledger-git no longer uses the non-existent git record command.
(#1942) (Patrick Fiaux)
- bin/watchaccounts is a small shell script for watching the account tree as you make changes.
* hledger 1.27.1 2022-09-18
Fixes
- Balance commands using `-T -O html` no longer fail with an error
when there is no data to report.
(#1933)
* hledger 1.27 2022-09-01
Features
- `hledger check recentassertions` (and flycheck-hledger in Emacs if
you enable this check) requires that all balance-asserted accounts
have a balance assertion within 7 days before their latest posting.
This helps remind you to not only record transactions, but also to
regularly check account balances against the real world, to catch
errors sooner and avoid a time-consuming hunt.
- The --infer-costs general flag has been added, as the inverse
operation to --infer-equity. --infer-costs detects commodity
conversion transactions which have been written with equity
conversion postings (the traditional accounting notation) and adds
PTA cost notation (@@) to them (allowing cost reporting).
See https://hledger.org/hledger.html#equity-conversion-postings .
(Stephen Morgan)
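The `recentassertions` rule in the first feature above can be sketched as a date comparison. This is a hedged illustration only: the function name, the inclusive boundaries, and the treatment of accounts without any assertion are assumptions, not hledger's actual code.

```python
from datetime import date, timedelta

def has_recent_assertion(posting_dates, assertion_dates, max_age_days=7):
    """Check one account: is there an assertion shortly before its latest posting?"""
    if not assertion_dates:
        # Assumption: only balance-asserted accounts are subject to the check.
        return True
    latest_posting = max(posting_dates)
    earliest_allowed = latest_posting - timedelta(days=max_age_days)
    # Assumption: both ends of the window are inclusive.
    return any(earliest_allowed <= d <= latest_posting for d in assertion_dates)

postings = [date(2022, 8, 1), date(2022, 8, 20)]
fresh = has_recent_assertion(postings, [date(2022, 8, 15)])  # 5 days before
stale = has_recent_assertion(postings, [date(2022, 8, 1)])   # 19 days before
```

Here `fresh` passes and `stale` fails, mirroring the reminder to keep asserting balances as new postings arrive.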
Improvements
- Many error messages have been improved. Most error messages now use
a consistent, more informative format.
(#1436)
- The accounts command has a new --directives flag which makes it
show valid account directives which you can paste into a journal.
- The accounts command has a new --positions flag which shows where
accounts were declared, useful for troubleshooting.
(#1909)
- Bump lower bounds for Diff and githash. (Andrew Lelechenko)
- GHC 8.6 and 8.8 are no longer supported. Building hledger now
requires GHC 8.10 or greater.
Fixes
- Account display order is now calculated correctly even when accounts
are declared in multiple files.
(#1909)
- At --debug 5 and up, account declarations info is logged.
(#1909)
- hledger aregister and hledger-ui now show transactions correctly
when there is a type: query.
(#1905)
- bal: Allow cumulative gain and valuechange reports.
Previously, --cumulative with --gain or --valuechange would produce an
empty report. This fixes this issue to produce a reasonable report.
(Stephen Morgan)
- bal: budget goal amounts now respect -c styles (fixes #1907)
- bal: budget goals now respect -H (#1879)
- bal: budget goals were ignoring rule-specified start date
- cf/bs/is: Fixed non-display of child accounts when there is an
intervening account of another type.
(#1921) (Stephen Morgan)
- roi: make sure empty cashflows are skipped when determining first cashflow (Charlotte Van Petegem)
Empty cashflows are added when the begin date of the report is before the first
transaction.
Scripts/addons
- https://hledger.org/scripts.html - an overview of scripts and addons in bin/.
- paypaljson, paypaljson2csv - download txns from paypal API
- hledger-check-postable.hs - check that no postings are made to accounts with a postable:(n|no) tag
- hledger-addon-example.hs - script template
* hledger 1.26.1 2022-07-11
- require safe 0.3.19+ to avoid deprecation warning
* hledger 1.26 2022-06-04
Improvements
- `register` and `aregister` have been made faster, by
- considering only the first 1000 items for choosing column
widths. You can restore the old behaviour (guaranteed alignment
across all items) with the new `--align-all` flag.
([#1839](simonmichael/hledger#1839), Stephen Morgan)
- discarding cost data more aggressively, giving big speedups for
large journals with many costs.
([#1828](simonmichael/hledger#1828), Stephen Morgan)
- Most error messages from the journal reader and the `check` command now use
a consistent layout, with an "Error:" prefix, line and column numbers,
and an excerpt highlighting the problem. Work in progress.
([#1436](simonmichael/hledger#1436)) (Simon Michael, Stephen Morgan)
- `hledger check ordereddates` now always checks all transactions
(previously it could be restricted by query arguments).
- The `--pivot` option now supports a `status` argument, to pivot on transaction status.
- Update bash completions (Jakob Schöttl)
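The column-width speedup in the `register`/`aregister` item above trades guaranteed alignment for speed. A minimal sketch of that trade-off follows; the function names and the way rows beyond the sample are rendered are assumptions made for illustration, not a description of hledger's code.

```python
def choose_width(items, sample_size=1000):
    """Pick a column width by measuring only the first sample_size items."""
    sample = items[:sample_size]
    return max((len(s) for s in sample), default=0)

def render(items, sample_size=1000):
    """Pad every item to the sampled width; longer later items simply overflow."""
    width = choose_width(items, sample_size)
    return [s.ljust(width) + " |" for s in items]

rows = ["a", "bb", "cccc"]
sampled = render(rows, sample_size=2)          # width chosen from "a" and "bb"
aligned = render(rows, sample_size=len(rows))  # analogous to --align-all
```

With a small sample the last row sticks out past the separator column, while measuring every item restores alignment at the cost of a full pass over the data.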
Fixes
- Value reports with `--date2` and a report interval (like `hledger bal -VM --date2`)
were failing with an "expected all spans to have an end date" error since 1.22;
this is now fixed.
([#1851](simonmichael/hledger#1851), Stephen Morgan)
- In CSV rules, interpolation of a non-existent field like `%999` or `%nosuchfield`
is now ignored (previously it inserted that literal text).
Note this means such an error will not be reported;
Simon chose this as the more convenient behaviour when converting CSV.
Experimental.
([#1803](simonmichael/hledger#1803), [#1814](simonmichael/hledger#1814)) (Stephen Morgan)
- `--infer-market-price` was inferring a negative price when selling.
([#1813](simonmichael/hledger#1813), Stephen Morgan)
- Allow an escaped forward slash in regular expression account aliases.
([#982](simonmichael/hledger#982), Stephen Morgan)
- The `tags` command now also lists tags from unused account declarations.
It also has improved command-line help layout.
([#1857](simonmichael/hledger#1857))
- `hledger accounts` now shows its debug output at a more appropriate level (4).
* hledger 1.25 2022-03-04
Breaking changes
- Journal format's `account NAME TYPECODE` syntax, deprecated in 1.13, has been dropped.
Please use `account NAME ; type:TYPECODE` instead.
(Stephen Morgan)
- The rule for auto-detecting "cash" (liquid asset) accounts in the `cashflow` report
has changed: it's now "all accounts under a top-level `asset` account, with
`cash`, `bank`, `checking` or `saving` in their name" (case insensitive, variations allowed).
So if you see a change in your `cashflow` reports, you might need to add
`account` directives with `type:C` tags, declaring your top-most cash accounts.
Features
- The new `type:TYPECODES` query matches accounts by their accounting type.
Account types are declared with a `type:` tag in account directives,
or inferred from common English account names, or inherited from parent accounts,
as described at [Declaring accounts > Account types].
This generalises the account type detection of `balancesheet`, `incomestatement` etc.,
so you can now select accounts by type without needing fragile account name regexps.
Also, the `accounts` command has a new `--types` flag to show account types.
Eg:
hledger bal type:AL # balance report showing assets and liabilities
hledger reg type:x # register of all expenses
hledger acc --types # list accounts and their types
([#1820](simonmichael/hledger#1820),
[#1822](simonmichael/hledger#1822))
(Simon Michael, Stephen Morgan)
- The `tag:` query can now also match account tags, as defined in account directives.
Subaccounts inherit tags from their parents.
Accounts, postings and transactions can be filtered by account tag.
([#1817](simonmichael/hledger#1817))
- The new `--infer-equity` flag replaces the `@`/`@@` price notation in commodity
conversion transactions with more correct equity postings (when not using `-B/--cost`).
This makes these transactions fully balanced, and preserves the accounting equation.
For example:
2000-01-01
a 1 AAA @@ 2 BBB
b -2 BBB
$ hledger print --infer-equity
2000-01-01
a 1 AAA
equity:conversion:AAA-BBB:AAA -1 AAA
equity:conversion:AAA-BBB:BBB 2 BBB
b -2 BBB
`equity:conversion` is the account used by default. To use a different account,
declare it with an account directive and the new `V` (`Conversion`) account type.
Eg:
account Equity:Trading ; type:V
([#1554](simonmichael/hledger#1554)) (Stephen Morgan, Simon Michael)
- Balance commands (`bal`, `bs` etc.) can now generate easy-to-process "tidy" CSV data
with `-O csv --layout tidy`.
In tidy data, every variable is a column and each row represents a single data point
(cf <https://vita.had.co.nz/papers/tidy-data.html>).
([#1768](simonmichael/hledger#1768),
[#1773](simonmichael/hledger#1773),
[#1775](simonmichael/hledger#1775))
(Stephen Morgan)
Improvements
- Strict mode (`-s/--strict`) now also checks periodic transactions (`--forecast`)
and auto postings (`--auto`).
([#1810](simonmichael/hledger#1810)) (Stephen Morgan)
- `hledger check commodities` now always accepts zero amounts which have no commodity symbol.
([#1767](simonmichael/hledger#1767)) (Stephen Morgan)
- Relative [smart dates](hledger.html#smart-dates) may now specify an arbitrary number of some period into the future or past.
Some examples:
- `in 5 days`
- `in -6 months`
- `5 weeks ahead`
- `2 quarters ago`
(Stephen Morgan)
- CSV output now always disables digit group marks (eg, thousands separators),
making it more machine readable by default.
([#1771](simonmichael/hledger#1771)) (Stephen Morgan)
- Unicode may now be used in field names/references in CSV rules files.
([#1809](simonmichael/hledger#1809)) (Stephen Morgan)
- Error messages improved:
- Balance assignments
- aregister
- Command line parsing (less "user error")
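The four relative smart date examples listed earlier in this section share two shapes: `in N UNITs` and `N UNITs ahead|ago`. The sketch below parses just those shapes into a signed count and a unit. It is an illustration under assumptions: the accepted unit names and the error handling are hypothetical, and hledger's own parser is not reproduced here.

```python
# Assumed set of period names, for illustration only.
UNITS = {"day", "week", "month", "quarter", "year"}

def parse_relative(text):
    """Return (signed_count, unit) for phrases like 'in 5 days' or '2 quarters ago'."""
    words = text.lower().split()
    if len(words) != 3:
        raise ValueError(f"unsupported phrase: {text!r}")
    if words[0] == "in":
        count_text, unit_text, sign = words[1], words[2], 1
    elif words[2] in ("ahead", "ago"):
        count_text, unit_text = words[0], words[1]
        sign = 1 if words[2] == "ahead" else -1
    else:
        raise ValueError(f"unsupported phrase: {text!r}")
    # Accept both singular and plural unit names.
    unit = unit_text[:-1] if unit_text.endswith("s") else unit_text
    if unit not in UNITS:
        raise ValueError(f"unknown period: {unit_text!r}")
    return sign * int(count_text), unit
```

A negative count after `in` and the word `ago` both point into the past, so `in -6 months` and `6 months ago` produce the same pair in this sketch.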
Fixes
- `--layout=bare` no longer shows a commodity symbol for zero amounts.
([#1789](simonmichael/hledger#1789)) (Stephen Morgan)
- `balance --budget` no longer elides boring parents of unbudgeted accounts
if they have a budget.
([#1800](simonmichael/hledger#1800)) (Stephen Morgan)
- `roi` now reports TWR correctly
- when there are several PnL changes occurring on a single day
- and also when investment is fully sold/withdrawn/discounted at the end of a particular reporting period.
([#1791](simonmichael/hledger#1791)) (Dmitry Astapov)
Documentation
- There is a new CONVERSION & COST section, replacing COSTING.
([#1554](simonmichael/hledger#1554))
- Some problematic interactions of account aliases with other features have been noted.
([#1788](simonmichael/hledger#1788))
- Updated: [Declaring accounts > Account types](https://hledger.org/hledger.html#account-types)
* hledger-lib 1.28 2022-12-01
- Hledger.Utils.Debug's debug logging helpers have been unified.
The "trace or log" functions log to stderr by default, or to a file
if ",logging" is appended to the program name (using withProgName).
The debug log file is PROGNAME.log (changed from debug.log).
- Moved from Hledger.Utils.Debug to Hledger.Utils.Parse:
traceParse
traceParseAt
dbgparse
- Moved from Hledger.Utils.Debug to Hledger.Utils.Print:
pshow
pshow'
pprint
pprint'
colorOption
useColorOnStdout
useColorOnStderr
outputFileOption
hasOutputFile
- Rename Hledger.Utils.Print -> Hledger.Utils.IO, consolidate utils there.
- Hledger.Utils cleaned up.
- Hledger.Data.Amount: showMixedAmountOneLine now also shows costs.
Note that different costs are kept separate in amount arithmetic.
- Hledger.Read.Common: rename/add amount parsing helpers.
added:
parseamount
parseamount'
parsemixedamount
parsemixedamount'
removed:
amountp'
mamountp'
- Hledger.Utils.Parse:
export customErrorBundlePretty,
for pretty-printing hledger parse errors.
- Support megaparsec 9.3. (Felix Yan)
- Support GHC 9.4.
- Update cabal files to match hpack 0.35/stack 2.9
* hledger-lib 1.27 2022-09-01
Breaking changes
- Support for GHC 8.6 and 8.8 has been dropped.
hledger now requires GHC 8.10 or newer.
- Hledger.Data.Amount: `amount` has been dropped; use `nullamt` instead.
- journal*AccountQuery functions have been dropped; use a type: query instead.
cbcsubreportquery no longer takes Journal as an argument.
(#1921)
Misc. changes
- Hledger.Utils.Debug now re-exports Debug.Breakpoint from the
breakpoint library, so that breakpoint's helpers can be used easily
during development.
- Hledger.Utils.Debug:
dlog has been replaced by more reliable functions for debug-logging
to a file (useful for debugging TUI apps like hledger-ui):
dlogTrace
dlogTraceAt
dlogAt
dlog0
dlog1
dlog2
dlog3
dlog4
dlog5
dlog6
dlog7
dlog8
dlog9
- Hledger.Utils.Debug: pprint' and pshow' have been added,
forcing monochrome output.
- Hledger.Utils.String: add quoteForCommandLine
- Hledger.Data.Errors: export makeBalanceAssertionErrorExcerpt
- Hledger.Utils.Parse: export HledgerParseErrors
- Debug logging from journalFilePath and the include directive will
now show "(unknown)" instead of an empty string.
* hledger-lib 1.26.1 2022-07-11
- require safe 0.3.19+ to avoid deprecation warning
* hledger-lib 1.26 2022-06-04
Breaking changes
- readJournal, readJournalFile, readJournalFiles now return
`ExceptT String IO a` instead of `IO (Either String a)`.
Internally, this increases composability and avoids some ugly case handling.
It means that these must now be evaluated with `runExceptT`.
That can be imported from `Control.Monad.Except` in the `mtl` package,
but `Hledger.Read` also re-exports it for convenience.
New variants readJournal', readJournalFiles', readJournalFile' are
also provided; these are like the old functions but more convenient,
assuming default input options and needing one less argument.
(Stephen Morgan)
- parseAndFinaliseJournal' (a variant of parseAndFinaliseJournal) has been removed.
In the unlikely event you needed it in your code, you can replace:
```haskell
parseAndFinaliseJournal' parser iopts fp t
```
with:
```haskell
initialiseAndParseJournal parser iopts fp t
>>= liftEither . journalApplyAliases (aliasesFromOpts iopts)
>>= journalFinalise iopts fp t
```
- Some parsers have been generalised from JournalParser to TextParser.
(Stephen Morgan)
Misc. changes
- Allow doclayout 0.4.
- Our doctests now run with GHC 9.2+ only, to avoid doctest issues.
- Hledger.Data.JournalChecks: some Journal checks have been moved and renamed:
journalCheckAccounts,
journalCheckCommodities,
journalCheckPayees
- Hledger.Data.Errors: new error formatting helpers
makeTransactionErrorExcerpt,
makePostingErrorExcerpt,
transactionFindPostingIndex
- HledgerParseErrors is a new type alias for our parse errors.
CustomErr has been renamed to HledgerParseErrorData.
- Hledger.Query: added
matchesQuery,
queryIsCode,
queryIsTransactionRelated
- Improve ergonomics of SmartDate constructors.
(Stephen Morgan)
- Hledger.Utils: Add a helper function numDigitsInt to get the number
of digits in an integer, which has a surprising number of ways to
get it wrong.
([#1813](simonmichael/hledger#1813)) (Stephen Morgan)
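As a rough illustration of why counting digits is easy to get wrong, the sketch below uses integer division only. It is written in Python rather than Haskell and is not the `numDigitsInt` implementation; in particular, treating zero as one digit and ignoring the sign are assumptions made here.

```python
def num_digits(n):
    """Count decimal digits of an integer using integer arithmetic only."""
    n = abs(n)        # assumption: the sign does not count as a digit
    digits = 1        # assumption: zero has one digit
    while n >= 10:
        n //= 10
        digits += 1
    return digits
```

The usual shortcut of taking a floating-point base-10 logarithm needs special cases for zero and negative numbers and can be thrown off by rounding near powers of ten, which is why an integer-only loop is the safer sketch.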
* hledger-lib 1.25 2022-03-04
- hledger-lib now builds with GHC 9.2 and latest deps.
([#1774](simonmichael/hledger#1774))
- Journal has a new jaccounttypes map.
The journalAccountType lookup function makes it easy to check an account's type.
The journalTags and journalInheritedTags functions look up an account's tags.
Functions like journalFilterPostings and journalFilterTransactions,
and new matching functions matchesAccountExtra, matchesPostingExtra
and matchesTransactionExtra, use these to allow more powerful matching
that is aware of account types and tags.
- Journal has a new jdeclaredaccounttags field
for easy lookup of account tags.
Query.matchesTaggedAccount is a tag-aware version of matchesAccount.
- Some account name functions have moved from Hledger.Data.Posting
to Hledger.Data.AccountName:
accountNamePostingType, accountNameWithPostingType, accountNameWithoutPostingType,
joinAccountNames, concatAccountNames, accountNameApplyAliases, accountNameApplyAliasesMemo.
- Renamed: CommodityLayout to Layout.
This introduces a new csv format which is useful for charting and further processing: tidy csv. Every row of the csv file consists of a single amount, and that row is indexed by the account name, the date, and the commodity. Here is an example: