How to collect usage statistics to make educated design decisions #224

issue MarcoFalke opened this issue on February 24, 2021
  1. MarcoFalke commented at 9:26 am on February 24, 2021: contributor

    Currently the gui is designed (or kept as-is) based on guessing how it is used. I’d say that developers and reviewers mostly use their own experience as a proxy for how everyone else is using the gui.

    It would be good if the gui had a way to record usage statistics, which can be enabled by users that want to opt-in to sharing this. The users could then upload the file to share their report with developers.

    I am not sure what ways Qt offers to achieve this, but even a plain simple click counter for each button would be better than nothing.
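
    A minimal sketch of the per-button click counter idea, written in plain Python rather than Qt. The class name `ClickCounter`, the `enabled` flag, and the button names are all hypothetical; a real implementation would hook into the GUI toolkit's own event mechanism.

```python
import json
from collections import Counter


class ClickCounter:
    """Counts clicks per button name; records nothing unless the user opted in."""

    def __init__(self, enabled=False):
        # Opt-in: collection is off unless explicitly enabled (assumed default).
        self.enabled = enabled
        self.counts = Counter()

    def record(self, button_name):
        """Increment the counter for one button, but only when opted in."""
        if self.enabled:
            self.counts[button_name] += 1

    def report(self):
        """Return the counts as JSON text the user could inspect before sharing."""
        return json.dumps(dict(self.counts), sort_keys=True)


# Hypothetical usage: three clicks on two made-up buttons.
counter = ClickCounter(enabled=True)
for name in ["send", "receive", "send"]:
    counter.record(name)
print(counter.report())
```

    Because only counts are stored, the report stays small and contains no addresses, amounts, or timestamps.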

  2. MarcoFalke added the label Brainstorming on Feb 24, 2021
  3. MarcoFalke added the label Feature on Feb 24, 2021
  4. hebasto commented at 9:38 am on February 24, 2021: member
    Could an opt-in log of events / signals be acceptable from the point of view of users’ privacy? What events / signals must be redacted from such a log?
  5. GBKS commented at 10:07 am on February 24, 2021: none

    Instead of capturing “all” events, this could also be driven by hypotheses we want to test. For example:

    • How important is it to be able to include multiple recipients in a transaction? If almost no-one uses this (and we find that some other aspect of transactions has more interest), the UI could be adjusted to match user priorities.
    • Is address reuse common? If so, the UI could be adjusted to discourage this.
    • How many users adjust fees? Are those adjusted values reasonable? If not, more education could be added.
    • After starting the application, how long does it usually take until all blocks are synced? If this is super long and uses 200% CPU on average, maybe some sort of optimization is needed.
    • What types of wallets are created most? Which types get the most usage?

    I just totally made these up and they might be unrealistic, so ignore the details. My point is that by starting with hypotheses, it’s easier to ensure that only relevant data is gathered, the resulting data set is more likely to be useful, and it’s much easier for users to understand why this is done and be comfortable sharing.

    I’d be more comfortable sharing a data set that includes (totally made up to make my point) lines like “You reuse addresses 3 times on average” and “You typically increase the default fee by 10%” than endless lists of logs.
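
    The condensed, human-readable report described above could be sketched roughly as follows. The function `summarize`, its parameters, and the sample values are hypothetical and only illustrate the idea: the raw data stays local and only the aggregate sentences would be shared.

```python
from collections import Counter
from statistics import mean


def summarize(address_uses, default_fee, chosen_fees):
    """Condense local raw data into short sentences (inputs must be non-empty).

    address_uses: one entry per use of an address (local placeholders only)
    default_fee:  the fee the wallet suggested (assumed constant here)
    chosen_fees:  the fees the user actually selected
    """
    uses_per_address = Counter(address_uses)
    avg_uses = mean(uses_per_address.values())
    # Average relative deviation from the default fee, in percent.
    avg_change = mean((fee - default_fee) / default_fee for fee in chosen_fees) * 100
    return [
        f"You use each address {avg_uses:.1f} times on average",
        f"You typically change the default fee by {avg_change:+.0f}%",
    ]


# Hypothetical sample data: address "a" used three times, "b" once.
lines = summarize(["a", "a", "a", "b"], default_fee=10.0, chosen_fees=[11.0, 11.0])
for line in lines:
    print(line)
```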

    The privacy aspect is super critical to this. Ideally, users can upload the data completely anonymously.

  6. MarcoFalke commented at 1:52 pm on February 24, 2021: contributor
    If the collected statistics are condensed into a short sentence, they could be collected as part of the next survey. Last one: https://achow101.com/2021/01/bitcoin-core-survey
  7. johnsBeharry commented at 4:08 pm on February 24, 2021: none

    If this is something that would be considered seriously to include then I think it’s very important to establish the constraints that govern data collection upfront. Generally nothing that can be used to correlate balances, identity, or location.

    Off the top of my head, the kinds of sensitive information would be:

    • addresses
    • timestamps
    • amounts
    • transaction hashes
    • ip addresses
    • labels
    • language
    • …?

    Usage patterns can be useful by themselves, but we will also need to know the environment and configuration of the operator, as it can influence these patterns (features that are removed or renamed). So some additional data points would also have to be collected so the usage data can be segmented. This can also help facilitate debugging. For example:

    1. Software Version
    2. Operating System
    3. Machine Specs
    4. UI Configs (exclude language)
    5. …?
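
    One way to picture the two lists above is a filter that drops the sensitive fields from each event and keeps only the environment fields used for segmentation. This is a sketch under assumptions: every key name below (`txid`, `ui_config`, and so on) is made up for illustration, and the lists in this comment are explicitly incomplete.

```python
# Hypothetical key names mirroring the sensitive items listed above.
SENSITIVE_KEYS = {"address", "timestamp", "amount", "txid", "ip", "label", "language"}

# Hypothetical key names mirroring the environment data points listed above.
ENVIRONMENT_KEYS = {"software_version", "operating_system", "machine_specs", "ui_config"}


def redact(event):
    """Return a copy of the event without any sensitive fields."""
    return {k: v for k, v in event.items() if k not in SENSITIVE_KEYS}


def build_report(environment, events):
    """Keep only allowed environment fields and redact every event."""
    env = {k: v for k, v in environment.items() if k in ENVIRONMENT_KEYS}
    return {"environment": env, "events": [redact(e) for e in events]}


# Hypothetical input: one event carrying sensitive data, plus environment info.
report = build_report(
    {"software_version": "0.21.0", "operating_system": "Linux", "language": "de"},
    [{"action": "send_clicked", "amount": 0.5, "address": "placeholder"}],
)
print(report)
```

    A denylist like `SENSITIVE_KEYS` silently lets through any field nobody thought of, so an allowlist of event fields would likely be the safer design.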

    Log files are not the easiest to read, so making this data easily consumable by the operator via a UI would allow them to assess whether they are comfortable sharing it, and could possibly be useful to them depending on what is being collected. Perhaps we learn more about possible monitoring requirements, and roll those in pending the results of the survey.

    Question: What would the storage requirements be for such kind of logging?

  8. jarolrod commented at 6:38 pm on February 24, 2021: member
    Assuming we abstract away how we will work around consent and protecting privacy, I think a simple solution is to log upon events, then parse the log.
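
    A rough sketch of the "log upon events, then parse the log" approach. The `event=<name>` line format and the function names are assumptions made for this example, not an existing log format.

```python
import io
from collections import Counter


def log_event(log, name):
    """Append one line per event; the 'event=<name>' format is hypothetical."""
    log.write(f"event={name}\n")


def parse_log(log_text):
    """Count how often each event name appears, ignoring unrelated lines."""
    counts = Counter()
    for line in log_text.splitlines():
        key, _, value = line.partition("=")
        if key == "event" and value:
            counts[value] += 1
    return counts


# Hypothetical usage with an in-memory log instead of a file on disk.
log = io.StringIO()
for name in ["open_send_tab", "open_receive_tab", "open_send_tab"]:
    log_event(log, name)
log.write("unrelated line\n")
counts = parse_log(log.getvalue())
print(counts.most_common())
```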
  9. Bosch-0 commented at 11:07 am on March 2, 2021: none

    I agree with GBKS’s approach of having various hypotheses worth testing. We could decide on hypotheses that focus on collecting data on new features added with that version, for example how users use descriptors in the 0.21.0 release. These could be mixed in with some more general meta hypotheses, such as what kind of wallets users use / how users construct tx’s etc. The results of achow’s survey will be helpful in deciding which hypotheses we test. All testing should be opt-in and as anonymous as possible.

    The opt in should be presented via on-boarding when using the GUI for the first time or via an overlay on first launch when updating to a newer GUI version. The info in the GUI should give a rough overview of what is being collected with a link to more in-depth documentation detailing what / why / when data is being collected (hosted on GitHub / bitcoincore.org?).

    It would also be handy to have an opt-in option in the settings, in case the user changes their mind and wants to contribute data later on.

  10. michaelfolkson commented at 11:36 am on March 2, 2021: member
    I know this has been said before, but if this is enabled it must always be opt-in. It should be documented clearly that this must never be made opt-out. I am sure some people will argue against this for the slippery slope reason: that if we introduce this, there is a risk of it being made opt-out (either intentionally or unintentionally) at some point in the future.

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/gui. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-23 00:20 UTC
