Adventures of the Retired Guy

Stuff and nonsense from a retired guy

22 Jun 2021

Toward a Producer-friendly Data Marketplace

How to put the creators of data in charge of how it is used, while at the same time retaining the current vibrant market for consumer data.

I was reading an article that used the phrase "data and money", maybe in the context of "moving data and money around the internet". But I thought, these days, data is money, and that's what set me off in this direction. Vendors will pay money to websites for access to users. Facebook has capitalized on providing very fine-grained information to potential advertisers, and the latter find it valuable and will pay FB a premium. So that's one way data is money. But encryption provides privacy and verification, and everybody is willing to pay (something) for a digital certificate. That's another way. Finally, there is cryptocurrency, where the data is literally money, not just a valuable commodity.

Lots of ideas like this have been tried before, Hailstorm for one. What has been documented from them?

It's a truism that, on the internet, you are not the user, you're the product. Eyeballs and clicks. The Attention Economy.

"Privacy advocates" decry the loss of control consumers have over the collection and use of their own data.

Yet Silicon Valley has perfected the art of extracting value out of consumer activity on the internet and has delivered genuinely useful services such as Google Search and especially Maps, Twitter, and Pandora or, if you prefer, Spotify. These are my examples of services that are free to the user, and that support themselves and recover their development costs from vendors via advertising or, in worse cases, by selling the data itself to entities that consider it valuable: commercial, governmental, private/civilian.

I think this makes life better for most people, though it's a mixed prospect. There can be a big downside for some users if information is misused, meaning used in ways the user hadn't thought about or bargained for when they generated the data.

Maybe also part of this problem is the marketplace for creative content and intellectual property: sound, image, code or written word, whose creators would like to earn a living doing what they do and which society would generally benefit from having access to. The current marketplace is congested with agents, restricted channels of distribution, abusive contracts and lawyers, and other ills. Creators naturally want some assurances, but the only ones available are provided by intermediaries who favor themselves first when providing distribution as a service.

Problem summary

Currently, web service and web site owners set the terms by which they collect data from users and dispose of it to their own customers. This makes users into, not wage slaves, but data slaves, locked into producing a commodity but not sharing in the full value of what they have produced, and definitely not free to take their business elsewhere (network lock-in from social and messaging services, walled-garden lock-in). They may be aware of what they have consented to, but they were the loser in the negotiation and are usually aware of that and resentful of it. If users could get what they perceive as a better deal, especially if they could get a piece of the action, they would probably engage more, to the site/service owner's benefit as well.

Of course, this would tend to make users into something like data prostitutes: willing to sell anything about themselves for a price. You could argue users need protections against the worst excesses, but I'm not sure where the authority to set the limits comes from. Social? Family? Church?

Vision

["person" -- the producer of some data]
["customer" -- (potential) consumer of some data]
["(the) system" -- the new system being proposed]

What if there was a system...

  • present at every interface where a person produces potentially merchantable information. (Information could be actual creative content or just transactional data or PII.)
  • acts as a broker employed by the person, able to negotiate a deal for sharing the information. Sharing a copy, not disposing of the original. Could be exclusive or non-exclusive, depending.
  • retains a log of all transactions, so the source of data is always provable.
  • sufficiently widespread that a person could use it for most transactions and kinds of data s/he generates. Potential customers for data are not going to like using it, so the system will have to be an unignorable market opportunity to get them to use it. But we cannot expect the system to be mandated, to gain market share by fiat.
  • automates the negotiation at the point of data generation, so the user is not slowed down by "mature reconsideration", "awed by the majesty of the ...", "friction".
  • person trusts the system because s/he understands the limits on the use of data (different limits for different kinds of data). Limits are trustworthy because [system fairy dust]. Maybe data in escrow? Customer must ask and explain each access, with transparent auditing? Blockchain is a trustworthy record!
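
To make the "log of all transactions" idea a bit more concrete, here is a minimal sketch of a tamper-evident log in Python. It is only an illustration of one possible approach (a hash chain, the same basic trick a blockchain uses), not a design for the system. The class name, field names and sample values are all made up for the example.

```python
import hashlib
import json


class TransactionLog:
    """Append-only log in which each entry commits to the one before it.

    Hypothetical sketch: the field names (person, customer, category, terms)
    are assumptions for illustration, not part of any real system.
    """

    GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(body):
        # Serialize with sorted keys so the same body always hashes the same.
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def append(self, person, customer, category, terms):
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        body = {
            "person": person,
            "customer": customer,
            "category": category,
            "terms": terms,
            "prev_hash": prev_hash,
        }
        entry = dict(body, hash=self._digest(body))
        self.entries.append(entry)
        return entry

    def verify(self):
        """Return True only if no entry was altered, removed or reordered."""
        prev_hash = self.GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev_hash"] != prev_hash:
                return False
            if self._digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True


log = TransactionLog()
log.append("alice", "acme-ads", "interests", "non-exclusive, 30 days")
log.append("alice", "acme-ads", "location", "non-exclusive, single use")
```

Changing any earlier entry after the fact makes `verify()` fail, which is what lets either party show what was actually agreed. A real system would also need signatures and some independent place to anchor the latest hash; none of that is attempted here.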

Considerations / Key Challenges

  1. insert system into the user's browser, but also into frameworks for web, mobile and IoT.
  2. Fairly painless way for customer to review and agree to the new rules and to migrate to adoption while continuing to use existing solutions in parallel (for persons who have not yet adopted the system).
  3. system must anticipate very fine-grained distinctions in potential uses of many categories of data, so the user can review them ahead of time instead of case-by-case. With an easy way to interrupt the user at "point of sale" to resolve novel situations. So person buy-in is key.
  4. examples of Categories/kinds of data
    1. Identity
      1. for advertising purposes, means a bundle of interests and ability to pay; supports "window shopping" by "unqualified" persons. Not for contact purposes. Maybe degrees of verified accuracy? So "interested in sports car" is low validity, but "certified owner of Boxster" is high.
      2. as a real person: name, address, government id, ... provable for institutional, legal purposes, not just a social "handle" or alias.
      3. (adopted) "alias" identity. Person can have multiple, can be used in social situations, recognizable as not binding.
    2. Location
      1. Physical, currently at website ... (and loyalty to it/ stickiness at it)
    3. "Interests"
      This is what advertisers want to know, but it's an overplowed field, so many others have tried this approach. Maybe we provide a way to review the other kinds of data, and let the potential advertiser decide whether the person might be interested? [[ customer sends a query using system schema, we return results. Query returns a single "score"? or other non-PII ]]
  5. examples of intended uses
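
The "query returns a single score" idea under "Interests" could look something like the sketch below. This is a guess at the shape of it, not a worked-out schema: the `Signal` and `Query` types, the `confidence` scale (standing in for "degrees of verified accuracy") and the sample data are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """One thing the system knows about a person (hypothetical shape)."""
    topic: str
    evidence: str
    confidence: float  # 0.0 = unverified guess, 1.0 = fully verified


@dataclass(frozen=True)
class Query:
    """What a customer is allowed to ask (hypothetical shape)."""
    topic: str
    min_confidence: float = 0.0


def score(profile, query):
    """Answer a query with one number in [0, 1] and nothing else.

    The customer never sees the signals themselves, only the best
    confidence among those that match the query.
    """
    matches = [
        s.confidence
        for s in profile
        if s.topic == query.topic and s.confidence >= query.min_confidence
    ]
    return max(matches, default=0.0)


# Made-up profile echoing the sports car example above.
profile = [
    Signal("sports car", "browsing history", 0.2),
    Signal("sports car", "certified owner of Boxster", 0.9),
]

interest = score(profile, Query("sports car"))
strict = score(profile, Query("sports car", min_confidence=0.95))
unrelated = score(profile, Query("boats"))
```

Here `interest` comes back as 0.9, while `strict` and `unrelated` both come back as 0.0. The customer learns how good a prospect the person is, but not why, and not who.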

Instructive Scenarios

  1. customer wants to know what to advertise to person. Person doesn't want to disclose much, but would disclose more if customer pays a bit to person for the answer. Paying a bit is not the same as offering a discount. System wants the user to see a return within the system for using it.
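
A scenario like this one, combined with the idea of reviewing rules ahead of time and only interrupting the person for novel situations, might be sketched as follows. Everything here is hypothetical: the policy table, the prices in cents and the category and use names are invented to show the shape of an automated negotiation, nothing more.

```python
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    ASK_PERSON = "ask person"


# Hypothetical policy the person reviewed ahead of time.
# Key: (category of data, intended use). Value: minimum price in cents,
# or None meaning "never share this for that use".
POLICY = {
    ("interests", "advertising"): 5,
    ("location", "advertising"): 25,
    ("identity", "advertising"): None,
}


def negotiate(policy, category, use, offer_cents):
    """Decide on a customer's offer without slowing the person down.

    Only a combination the person has never reviewed interrupts them.
    """
    key = (category, use)
    if key not in policy:
        return Decision.ASK_PERSON  # novel situation, resolve at "point of sale"
    floor = policy[key]
    if floor is None or offer_cents < floor:
        return Decision.REJECT
    return Decision.ACCEPT


good_offer = negotiate(POLICY, "interests", "advertising", 10)
low_offer = negotiate(POLICY, "interests", "advertising", 2)
never = negotiate(POLICY, "identity", "advertising", 1000)
novel = negotiate(POLICY, "location", "research", 100)
```

In this sketch `good_offer` is accepted, `low_offer` and `never` are rejected, and `novel` goes back to the person because nobody has decided yet what location data for research should cost.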

meta

  • It really is a marketplace of data, with real prices and real-world value to person and customer, not the hippie notion that data is free, data wants to be free.
  • But leverage the "open source" movement to rein in the current capitalist-entrepreneurial dictating of the terms of internet life.
  • Transparency is key. Full transparency between parties to the transaction, fuzzy visibility to other participants in the system, managed but non-discretionary transparency to regulatory authorities and law enforcement.

Next steps