Future Technologies Conference 2023: Thoughts and Reflections

“Nobody can predict the future. But, we have to try.”

In early November, I was invited to speak at the Future Technologies Conference (FTC) in San Francisco. I was grateful for the opportunity to take the stage and share my research, and equally grateful to be present and learn from others. The quote above was stated by the keynote speaker, Tom Mitchell, who opened the conference. While Mitchell’s words hold a certain truth, they also invite us to delve deeper into the intricate dance between anticipation and adaptability in our rapidly changing world.

Any time I think about the future, I am brought back to a Bolivian proverb which I learned from distinguished writer, sociologist, feminist, and activist Silvia Rivera Cusicanqui. The saying, roughly translated, goes like this:

“The future is at my back. I stand in the present, and all we can see is the past.”

It’s an idea that initially may seem foreign, but as it settles in, it reveals a profound wisdom. We can only tell what is likely to happen by what has already happened. In Western culture, we’re often conditioned to fixate on the future, perpetually reaching for it, striving to predict and shape what’s to come. However, the reality is that while we can make educated guesses, the future remains shrouded in uncertainty. The unpredictable nature of the future raises questions about the power of self-fulfilling prophecies.

As we delve into the realm of future technologies, the landscape is vast and bustling with innovation. Many breakthroughs are driven by predictive abilities, often harnessed for purposes related to commerce, such as the orchestration of emotional responses in marketing endeavors. We are in a time where technologies are being deployed at great speed with very few guardrails in place, disregarding potential consequences. It is essential to recognize that these powerful technologies are not solely defined by their applications, but also by what goes into the making of them, including raw materials and data.

The creators behind new technologies, such as some of the speakers at the conference, often have noble intentions, with a focus on critical global issues like climate change, sustainability, public health, and technologies that enhance the well-being of individuals. Nevertheless, there is a recurring pattern where technologies take unforeseen paths, diverging from their original intentions. This becomes particularly complex when dealing with formidable forces like artificial intelligence (AI) and virtual reality (VR). These technologies are developing incredibly rapidly, with nearly endless possibilities, along with a lot of ethical concerns.

The topics covered at FTC included:

  1. Deep Learning

  2. Large Language Models

  3. Data Science

  4. Ambient Intelligence

  5. Computer Vision

  6. Robotics

  7. Agents and Multi-agent Systems

  8. Communications

  9. Security

  10. e-Learning

  11. Artificial Intelligence

  12. Computing

In this post, I will give an overview of my experience at the conference, including the talk that I presented, and share some highlights from other presentations.

My Presentation

The Future Technologies Conference covered a wide range of topics and specialties, with mine falling under data science. My presentation focused on the topic of data mining, titled: “In Consideration of Indigenous Data Sovereignty: Data Mining as a Colonial Practice”. During my talk, I explained what Indigenous Data Sovereignty is and why it is important, before describing the CARE Principles of Indigenous Data Governance: Collective Benefit, Authority to Control, Responsibility, and Ethics.

To exemplify how the CARE principles could be applied throughout the data lifecycle, I reviewed a case study to first show how the above principles are often missing in data collection practices.

  • A European NGO went to Burundi to collect data on water accessibility.[1]

  • The NGO failed to understand the community’s perspective on the central issues at stake, or the potential harms of its own actions.

  • By sharing the data publicly, including geographic locations, the NGO put the community at risk.

  • Collective privacy was violated, and the community’s trust was lost.

  • The CARE principles were violated, particularly Collective Benefit and Responsibility.

I closed the talk with some recommendations. Three main data challenges need to be remediated: data collection, data access, and relevance, so that Indigenous Peoples can access, use, and control their own data and information.[2] It is vital to note that local concerns vary across regions, although all have been negatively shaped by long-standing, exploitative colonial practices. It is imperative that we continue to educate ourselves and question broader narratives that stem from colonial roots.

It was apparent that many attendees hadn’t considered the approach I presented, yet it resonated with them. I hope I prompted them to look at data from a different perspective and to think about the people behind the data.

I was amazed at how well the talk was received and at the feedback and questions that I got. One question concerned the case study: how exactly were the people in question harmed by data collection that did not adhere to the CARE principles? I explained in more detail how the community in Burundi was harmed when researchers shared their personal data, including location data, which broke the community’s trust. This shows that privacy is not only a personal issue; for Indigenous communities, privacy is often a collective issue. That framing is easy to miss from a Western perspective, which tends to treat privacy as purely individual. By expanding our understanding of human values and how they vary culturally and regionally, we can better understand how data collection and new technologies will affect different populations.

Afterwards, many people approached me and wanted to discuss my research further. The best comment I got was simply: “Your research is so rare.”

That is why I wanted to present this work at this particular conference on future technologies: so much of it relies on data, and a lot of it. Data often comes from people, and it has enormous value; some say it is more valuable than any other resource. Most people benefit from the convenience of the insights and apps built on public data. But why is it only corporations that benefit monetarily from everyone’s data? Why is this outright exploitation allowed? Isn’t it neo-colonialism at work? This was the message I was trying to get across.

Notable Talks at FTC

The talks ranged across the board of new and future technology, centering on AI, VR, and more. At lunch, I met Myana Anderson, who told me that she was speaking about bears. Her talk, Metabolic Adaptation in Hibernating American Black Bears, was about how bears have something that humans don’t: the ability to be sedentary for long periods of time in order to hibernate. Human bodies are made to move; if we are too sedentary, we develop blood clots and all sorts of other health issues. She and her fellow researchers studied blood samples from hibernating bears to see what exactly allows bears to remain immobile and maintain homeostasis, and what we might learn from that data for treating a variety of sedentary-related human diseases and for helping people with conditions that worsen with immobility.

This was certainly unique and compelling research that could benefit people with disabilities and illnesses who are immobilized. However, one aspect worries me: could this research point us toward a dystopia of immobility, where life is lived in VR and people turn into blobs and hibernate like bears? This was not mentioned in the talk; it is purely my speculation. Is this really the direction we want to go in? Why can’t we find ways to stay mobile and active in our world? Or would this research truly be used only to support people whose conditions force them to be immobile, rather than to let the general population sit without moving for long periods?

It was interesting because it was far removed from my own research, but I remain slightly worried about how it will be used. It was also interesting to consider a study that uses data not from humans but from animals, which may necessitate the consideration of animal rights going forward.

Another talk that stood out to me was The Effects of Gender and Age-Progressed Avatars on Future Self-Continuity, by Marian McDonnell. The research used VR to create age-progressed avatars in an effort to help people feel more empathy for their future selves and save more money for retirement. The idea is that people find it easy to empathize with and care for their parents and children, but not for their future selves. The researchers did find the intervention effective in getting people to think about retirement and put money away for their own futures. The most interesting finding, however, was the difference between men and women. When men were introduced via VR to an older version of themselves, they thought they looked like their father and found it neat. When women did the same, they were shocked and terrified to see themselves aged.

Women and men are treated so differently socially and culturally, with such different expectations, including around aging. Older women are rarely represented as having worth; they are rendered largely invisible, whereas older men are still portrayed as attractive and are afforded a respect that women are not. This social conundrum became very clear through the study, and the researchers made sure to include it as a vital part of the work, which I thought was notable. Repairing these inequalities will take more than VR, but it could be an interesting body of research to pursue. Deep-seated inequalities make themselves visible during projects like this, presenting an opportunity to address them in appropriate and creative ways.

Final Thoughts

Throughout the conference, there were instances where I hesitated to express my thoughts, observing a prevailing emphasis on speed and sales and a lack of attention to responsibility and transparency. While some discussions touched on the ethical dimensions of technology, particularly in environmental applications, the technical details often delved into intricacies beyond my social science expertise. It was an opportunity to develop my own knowledge in technical areas and to share knowledge with others in adjacent fields. That is why in-person conferences are so vital: the knowledge shared can mesh together, and those in attendance can come away with a better understanding of things they may have overlooked.

As I sat in attendance, occasions arose where I wished to ask about ethical considerations. In one of those moments, another participant raised a question about the same concerns, only to receive a response acknowledging a lack of expertise in that domain. I found this a bit concerning; it highlights the necessity of building safety and responsibility into what we create now and in the future.

In addressing the rapid evolution of the present into the future, concerns inevitably arise. Rather than understanding these as worries, reframing them as foresight becomes crucial for establishing checks, balances, and comprehensive protections around emerging technologies. This includes considerations not only during implementation but also at the initial stages of the data lifecycle, ensuring safeguards at every level without causing unintended harm. The question persists: Can we mitigate potential harms associated with new technologies, or is some level of harm inevitable?

Presently, an opportune moment exists for integrating ethics into technological discourse. However, it is imperative to approach this integration with an awareness of historical and existing systemic contexts. This nuanced approach is essential to navigate ethical considerations in a manner that acknowledges the complexities of past and current systems.

References

[1] Abebe, Rediet, Kehinde Aruleba, Abeba Birhane, Sara Kingsley, George Obaido, Sekou L. Remy, and Swathi Sadagopan. “Narratives and Counternarratives on Data Sharing in Africa.” In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 329–41. Virtual Event Canada: ACM, 2021. https://doi.org/10.1145/3442188.3445897.

[2] Rodriguez, Oscar Luis Figueroa. “Indigenous Policy and Indigenous Data in Mexico: Context, Challenges and Perspectives.” In Indigenous Data Sovereignty and Policy, 130–47. Routledge Studies in Indigenous Peoples and Policy. Abingdon, Oxon; New York, NY: Routledge, 2021.



AI Regulations in the US: Will We See Implementation and Environmental Protection?

Introduction

Amidst the current buzz around user-centered AI, the spotlight often shines on individual gains, but isn't it time we cast a broader gaze? While I'm all for enhancing user experiences, let's also consider how AI's ripples extend to impact not only humanity but the very fabric of our natural world. How is the exponential growth of data, which AI both relies on and produces, affecting climate change through the need to store it all in data centers? How does AI marginalize people who may not even realize it is at play, such as in decision-making for jobs or loans? These are the types of questions we need to be asking.

There are two major documents in the US that are meant to inform policymaking on AI. One is the AI Bill of Rights. On occasion, when I raise this topic, people misinterpret it as advocating for rights for AI entities; to be clear, that is not the discussion here. The AI Bill of Rights deals with human rights around AI, such as the right to be made aware when you are talking to a bot and to have the choice to talk to a human instead. In addition, it calls for rights that confront unchecked social media data collection and protect against discrimination. As you can imagine, the ‘rights’ in the AI Bill of Rights, well-meaning as they are, are not currently being implemented. And I must highlight the one huge area that got left out of the AI Bill of Rights: any concern for the environment, which is intrinsically linked to human rights.

The second document is the AI Risk Management Framework (AI RMF). This one does address AI’s effects on the environment in several places, along with many other areas of concern. Beyond its primary audience, the AI RMF is also aimed at broader groups of AI actors, including trade associations, standards organizations, researchers, advocacy groups, environmental organizations, civil society, end-users, and potentially affected individuals and communities. According to the document, these AI actors can:

  - Provide context and understanding of potential and actual AI impacts.

  - Offer formal or quasi-formal norms and guidance for AI risk management.

  - Define boundaries for AI operations (technical, societal, legal, and ethical).

  - Encourage discussions on balancing societal values and priorities, including civil liberties, equity, the environment, the economy, and human rights. (Tabassi, 2023)

It is a positive thing to be working on regulation around AI, and we will see more and more of it going forward. How much it actually helps is hard to tell, as corporations are mostly self-regulating, which they cannot be trusted to do. According to Timnit Gebru, founder of the Distributed AI Research Institute (DAIR) and co-founder of Black in AI, “the #1 thing that would safeguard us from unsafe uses of AI is curbing the power of the companies who develop it and increasing the power of those who speak up against the harms of AI and these companies’ practices.” (Gebru, 2021) In contrast to the arms-race framing favored by big tech executives, the real obstacle to innovation lies in the existing system: a small group creates technology with harmful consequences, while others are consistently occupied with mitigating that harm, leaving them with limited time, resources, and space to realize their own vision for the future. (Gebru, 2021)

How to implement regulations, both at the corporate level and at the level of users, is very important and also very challenging. Just as when cars first arrived, roads needed to change, guardrails needed to be put up, and rules needed to be put in place. What analogous measures, akin to speed limits and safety belts, must we consider when evaluating the risks associated with AI?

We Need to Focus on User-centered AI, not Corporations, but How?

Organizations advocating for civil rights, such as the Electronic Privacy Information Center, have actively engaged in the broader discourse concerning AI regulations. They have expressed reservations about the idea that industry associations should hold substantial influence in shaping policies related to a rights-oriented document formulated by the White House. (Krishan, 2023)

Formally titled the Blueprint for an AI Bill of Rights, the document, released in October 2022, is the outcome of a joint effort involving the Office of Science and Technology Policy (OSTP), scholars, human rights organizations, the wider public, and major corporations including Microsoft and Google. The blueprint offers recommendations to enhance the transparency, equity, and security of AI applications. It delves into the immediate and potential injustices to civil rights brought about by AI, with a particular focus on domains like employment, education, healthcare, financial access, and business surveillance. (The White House, 2023)

Summary of AI Bill of Rights

When the White House introduced its vision for an AI ‘Bill of Rights,’ it presented an approach centered on human rights as the foundational basis for regulating AI. This was succeeded in January by the AI RMF, which adopted a risk-centric perspective: a framework for assessing the extent and nature of risks associated with specific use scenarios and identifying potential threats, in order to establish a sense of reliability in AI technology. (Krishan, 2023)

The following are the main areas that the AI Bill of Rights addresses:

  • SAFE AND EFFECTIVE SYSTEMS: You should be protected from unsafe or ineffective systems.

  • ALGORITHMIC DISCRIMINATION PROTECTIONS: You should not face discrimination by algorithms and systems should be used and designed in an equitable way.

  • DATA PRIVACY: You should be protected from abusive data practices via built-in protections and you should have agency over how data about you is used.

  • NOTICE AND EXPLANATION: You should know that an automated system is being used and understand how and why it contributes to outcomes that impact you.

  • HUMAN ALTERNATIVES, CONSIDERATION, AND FALLBACK: You should be able to opt out, where appropriate, and have access to a person who can quickly consider and remedy problems you encounter. (The White House, 2023)

These are all incredibly important areas of focus, but I have two major concerns that remain unanswered. The first is, frankly: how and when are these rights to be implemented? The second, of course, is the lack of concern for the environment, and for the intrinsic connection between human rights and the necessary protections of the natural world, which is sorely missing from this bill of rights.

Summary of AI Risk Management Framework

The AI Risk Management Framework (AI RMF) acknowledges that while artificial intelligence (AI) holds immense potential to positively impact various aspects of society and the environment, it also carries unique risks that can affect individuals, organizations, and communities. Unlike traditional software systems, AI systems can be influenced by evolving data and complex contexts, making risk detection and response challenging. These socio-technical systems are susceptible to amplifying inequalities and undesirable outcomes, but responsible AI practices, emphasizing human-centricity and social responsibility, can help mitigate these risks. AI risk management is essential for fostering responsible AI development and use, enhancing trustworthiness, and building public trust in this transformative technology. (Tabassi, 2023)

Figure 1 in the AI RMF (Tabassi, 2023) provides an abbreviated, broad-strokes look at potential and actual harms of AI. A major branch of that figure addresses environmental concerns, unlike the AI Bill of Rights. This is vital to address, because if nothing is done, the harm could compound over time.

Expert Reactions to the AI Bill of Rights and the AI Risk Management Framework

Nicole Foster, in charge of shaping worldwide policies related to AI and machine learning at Amazon Web Services, highlighted a major issue regarding the documents. She pointed out that the primary source of concern lies in the conflicting interpretations of the technology's very essence. (Krishan, 2023) The way that the two documents define AI is seemingly contradictory.

According to experts in AI policy, the absence of clear directives from the White House regarding how to reconcile contrasting perspectives on AI—those centered on rights and those on risks—has become a significant obstacle for companies striving to develop innovative products while ensuring necessary protections. (Krishan, 2023)

Personally, I think that they could and should be implemented together, and honestly I feel that arguing over definitions seems like an excuse to not abide by the necessary regulations that they suggest.  

Patrick Gaspard, who serves as the CEO and president of the Center for American Progress, acknowledged the substantial commitment made by the Biden administration in crafting the significant Blueprint for an AI Bill of Rights. Gaspard highlighted the impending AI executive order as an ideal occasion to transform these guiding principles into enforceable policy within the United States. He emphasized that this presents an opportune moment for the president to prioritize democracy and individual rights in shaping the trajectory of these influential tools. (Hananel, 2023)

Ben Winters, the senior legal advisor at EPIC overseeing their efforts in the realm of AI and human rights, expressed his regret that the business sector is displeased with the policy document's alignment, or lack thereof, with their profit-oriented motives—namely, generating revenue, exploiting individuals' data, and securing additional contracts. He emphasized that the document is a policy-oriented text, and as such, industry entities do not possess the authority to author it according to their preferences. (Krishan, 2023)

We Need to Include Concerns for the Environment/Natural World

Unlike the Blueprint for the AI Bill of Rights, the AI Risk Management Framework (RMF) does include protections for the environment. For instance, it states that AI systems “should not, under defined conditions, cause physical or psychological harm or lead to a state in which human life, health, property, or the environment is endangered.” (Source: ISO/IEC TS 5723:2022) (Tabassi, 2023 p.13) 

This and the rest of the 36-page document contain many important considerations and regulatory needs for AI risks; however, there are major questions about how to make regulation actually happen. Who will be the AI police? Who will be the highway patrol? It is much harder to see how and when AI is causing harm than it is with cars on the road, where it is easy to see that a crash has happened or to measure a vehicle’s carbon output.

Certainly, one example of how AI can be damaging to the environment is through its association with energy consumption and carbon emissions. Here's how:

1. Increased Energy Demands: AI models, particularly deep learning models, require significant computational power to train and execute tasks. This necessitates the use of powerful hardware, including graphics processing units (GPUs) and specialized AI chips, which are energy-intensive. Large-scale AI training processes, such as training a language model like GPT-3, can consume substantial amounts of electricity over extended periods.

2. Data Center Operations: Many AI applications, especially those involving big data and machine learning, rely on data centers for processing and storage. These data centers operate 24/7 and require vast amounts of energy for cooling and maintaining server infrastructure. Cooling alone can account for a significant portion of the energy consumption.

3. Manufacturing of Hardware: The production of AI hardware components, including GPUs and specialized AI chips, involves resource-intensive processes and the mining of rare materials. These processes contribute to environmental degradation and carbon emissions.

4. E-waste: As AI technologies evolve rapidly, older AI hardware can become obsolete. The disposal of electronic waste (e-waste) generated by outdated AI equipment can have detrimental effects on the environment if not managed properly.

5. Indirect Environmental Impact: AI is also used in applications like optimizing logistics and transportation systems. While these applications can reduce energy consumption in some cases, they may lead to increased overall energy use if not implemented thoughtfully.

6. Data Centers Location: The location of data centers matters. If they are powered by fossil fuels in regions with high emissions, the carbon footprint of AI operations can be significant.

Efforts are being made to mitigate these environmental impacts. This includes research into more energy-efficient AI algorithms, the use of renewable energy sources for data centers, and the development of AI hardware with lower power requirements. However, it's important to recognize that AI's environmental impact is a complex issue that requires ongoing attention and sustainable practices to minimize its harm to the environment.
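To make the energy point concrete, here is a rough back-of-envelope sketch in Python. Every number in it is an illustrative assumption for the sake of the arithmetic, not a measurement of any real system or model.

```python
# Rough, illustrative estimate of training energy use and emissions.
# All numbers below are placeholder assumptions, not measurements.

num_gpus = 1000            # assumed number of accelerators
gpu_power_kw = 0.4         # assumed average draw per GPU, in kilowatts
training_hours = 24 * 30   # assumed one month of continuous training
pue = 1.5                  # assumed data-center overhead (cooling, power delivery)
grid_kg_co2_per_kwh = 0.4  # assumed grid carbon intensity (kg CO2 per kWh)

energy_kwh = num_gpus * gpu_power_kw * training_hours * pue
emissions_tonnes = energy_kwh * grid_kg_co2_per_kwh / 1000

print(f"Estimated energy: {energy_kwh:,.0f} kWh")
print(f"Estimated emissions: {emissions_tonnes:,.0f} tonnes CO2")
```

Even with these modest placeholder numbers, a single month-long training run lands in the hundreds of tonnes of CO2, which is why the grid mix powering a data center matters so much.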

We have regulations that say this shouldn’t happen. We have an opportunity to step up and make AI more ethical, by not letting it hurt anyone or the world around us, and by letting it help us make non-harm the norm.

The chief science advisor to President Biden has indicated that a growing awareness of the potential hazards associated with AI is driving a pressing initiative to establish protective measures. (Groll, 2023) When will we see these protective measures implemented? And further, what unforeseen side effects might these implementations have?

A central motif revolves around the transformation of aspirations into practical steps: How can we effectively put into practice the substantial promises that have been put forth?

A particularly potent instrument that fills us with enthusiasm, and one that we anticipate will garner significant attention in the coming years, is the use of AI as a pivotal player in addressing climate challenges. The horizon holds an increasing array of prospects spanning governments, enterprises, and non-profit organizations as they harness AI’s capabilities to expedite their efforts to combat climate change. (Toplensky, 2023)

As an illustration, consider the case of UPS, which is leveraging machine learning solutions to enhance the optimization of their truck routes. This initiative has resulted in the conservation of tens of millions of gallons of fuel annually, thereby curbing corresponding emissions. Furthermore, we've witnessed AI applications revolutionizing the synchronization of traffic lights within urban landscapes. An example lies in collaboration on the Greenlight project in Hamburg, Germany, where the reduction in stop-and-start occurrences translated to a 10% decrease in emissions. (Toplensky, 2023)

Conclusion: There is Much Work to be Done, and Fast 

The AI Bill of Rights constitutes a collection of principles crafted by the OSTP, aiming to steer the ethical development and application of artificial intelligence. This initiative emerges in the context of a worldwide endeavor to formulate increased regulations that oversee the realm of AI. (The White House, 2023) This is all well and good, but we need to see it put into action. To conclude, here is an outline of just one aspect that I would like to see implemented, framed in terms of the problem, a proposed solution, what we already know, and what we still need to know.

Problem: The development and deployment of AI can harm the environment and exacerbate climate change. The profit-driven approach to AI development may prioritize the interests of big companies over the well-being of humanity.

Solution: The AI Bill of Rights should include provisions that prioritize the betterment of humanity and its influence on the natural environment. 

We already know: We are in a climate crisis. We must be careful not to view AI as a solution for all problems without considering its costs and benefits (van Wynsberghe, 2020), such as the high energy consumption needed to train and run algorithms. (Coeckelbergh, 2021) We know that “. . . the tension at the heart of climate AI is that it reproduces the very problems it claims to be solving: those of the climate crisis.” (Baker & Gabrys, 2022)

We need to know: The potential impacts of AI on the environment and climate, including the potential risks and benefits of AI in addressing environmental challenges. We need to understand the economic incentives and disincentives that may drive or hinder the development of environmentally-friendly AI.

This article is a call to action to refocus priorities, pull back from sci-fi level fears of AI, and understand what is already in place for AI regulation, but not being implemented. How do we get that to happen? What can we as individuals do?

Take Action: It's time for all of us to play a crucial role in shaping the future of AI. Stay informed, support advocacy groups, engage with your elected representatives, and participate in public consultations. Encourage ethical AI practices, promote transparency, and be a responsible consumer. Raise awareness, advocate for privacy, and support legislation that prioritizes safety and fairness. Remember, your voice matters in the development of AI regulations that safeguard our society's interests. Together, we can ensure that AI technologies serve humanity responsibly and ethically. 


Resources

Baker, K., & Gabrys, J. (2022). Earth for AI: A Political Ecology of Data-Driven Climate Initiatives. Geoforum, 131, 1-10. https://doi.org/10.1016/j.geoforum.2022.01.016

Coeckelbergh, M. (2021). AI for climate: Freedom, justice, and other ethical and political challenges. AI and Ethics, 1, 67–72.

Groll, E. (2023, August 12). White House is fast-tracking executive order on artificial intelligence. CyberScoop. https://cyberscoop.com/white-house-executive-order-artificial-intelligence/



Gebru, T. (2021, December 6). For truly ethical AI, its research must be independent from big tech. The Guardian. https://www.theguardian.com/commentisfree/2021/dec/06/google-silicon-valley-ai-timnit-gebru



Hananel, S. (2023, August 3). RELEASE: Civil Rights, Tech Groups Call on Biden To Protect Public from Harms of AI. Center for American Progress. https://www.americanprogress.org/press/release-civil-rights-tech-groups-call-on-biden-to-protect-public-from-harms-of-ai/



Krishan, N. (2023, August 24). Experts warn of ‘contradictions’ in Biden administration’s top AI policy documents. FedScoop. https://fedscoop.com/experts-warn-of-contradictions-in-biden-administrations-top-ai-policy-documents/



Perrigo, B. (2022, February 17). Inside Facebook’s African Sweatshop. Time. https://time.com/6147458/facebook-africa-content-moderation-employee-treatment/



Tabassi, E. (2023). AI Risk Management Framework. https://doi.org/10.6028/nist.ai.100-1



The White House. (2023, March 16). Blueprint for an AI Bill of Rights. Office of Science and Technology Policy. https://www.whitehouse.gov/ostp/ai-bill-of-rights/



Toplensky, R. (2023, June 22). Google’s CSO Kate Brandt on how AI can accelerate climate action. WSJ. https://www.wsj.com/articles/googles-cso-kate-brandt-on-how-ai-can-accelerate-climate-action-8410242c



van Wynsberghe, A. (2020). Artificial Intelligence: From Ethics to Policy. Panel for the Future of Science and Technology, European Parliamentary Research Service (EPRS), Scientific Foresight Unit (STOA).



In Consideration of Indigenous Data Sovereignty


“Indigenous Peoples have always been ‘data warriors’. Our ancient traditions recorded and protected information and knowledge through art, carving, song, chants and other practices.” (Kukutai, 2020)

Long before the advent of computers and the written word, Indigenous Peoples had a rich tradition of transmitting knowledge and data from one generation to the next. However, the concept of Indigenous Data Sovereignty (ID-SOV) is a relatively recent development, with its first official documentation dating to 2016 (Taylor & Kukutai, 2016). This post will review ID-SOV and the CARE Principles of Indigenous Data Governance in an effort to move towards decolonizing data.

ID-SOV can be described as the right of Indigenous Peoples to possess, manage, access, and have authority over data that originates from their communities and relates to their members, traditional knowledge, customs, or lands. (Kukutai, 2020)

To state something as a right is one thing; to see it carried out, however, we must untangle a long history of manipulation of data on Indigenous Peoples, who were historically demonized to justify settler colonialism. Now, when neo-colonialism is rife, we see how this narrative continues by victimizing Indigenous Peoples. I align with those who argue that this narrative needs to change. According to the Global Indigenous Data Alliance (GIDA), building strategic relationships with global bodies and mechanisms is necessary to promote ID-SOV and governance internationally through a visible, collective approach. (Kukutai, 2020)

Even today, sensitive COVID-19 data from Indigenous Americans is being mined and reused without consent by the media, researchers, and non-governmental organizations, on the assumption that making tribes more visible would be helpful, while actually causing unintentional harm in the process. (RDA COVID-19 Indigenous Data WG, 2020) Settler colonialists thought that they were ‘helping’ too, via ethnic cleansing and direct violence. While neocolonialism is not inherently violent, it is extremely dangerous (Couldry & Mejias, 2021), and tracing these histories can help us understand how to move towards decolonizing data for the benefit of all.

Decolonizing Data Via Self-Determination

Data and data analytics have become increasingly important and interdependent in many ways in the digital age. Even governments are heavily reliant on data for their decision-making and policies. As has been the case through much of our history, the unwilling targets of policy interventions are disproportionately Indigenous Peoples, whose enduring aspirations for self-determination over their own knowledge, information systems, institutions, and resources are undermined by governments. Data is extracted from Indigenous Peoples, their territories, and their cultural heritage without their consent or involvement in decisions about how that data is collected, used, or applied. (Walter et al., 2021)

To have this conversation about ID-SOV, let us first discuss the difficulty of defining what it means to be Indigenous. As per the UN Declaration on the Rights of Indigenous Peoples (UNDRIP), indigeneity is intricately tied to the point of initial colonial contact, which can prove challenging to ascertain in regions where colonizers did not establish permanent settlements. The term ‘tribes,’ though sometimes practical, carries problematic colonial connotations. The label ‘Indigenous,’ by contrast, has a broader scope, encompassing a diverse range of ethnic groups, including the hill tribes residing in the Mekong River area of Southeast Asia (Scott, 2009). A common thread among Indigenous Peoples is their strong inclination toward preserving their autonomy. At the same time, they frequently confront marginalization and discrimination, often framed within a narrative that portrays them as victims. (Chung & Chung, 2019, p. 7)

In the pursuit of decolonization, it's crucial to emphasize that the concept of 'Indigenous' itself was a construct devised by colonizers to delineate who was considered fully human and who was relegated to a status deemed less than human (Scott, 2009). It is inherently problematic that we continue to operate within the framework established by this historical perspective. When it comes to the contemporary mission of decolonizing data, a pivotal starting point lies in the recognition of Indigenous Data Sovereignty. By placing the focus on those who have endured the most severe marginalization due to colonialism, we may uncover a clearer path forward in our journey towards decolonization.

There are many concerns from Indigenous groups such as those in the Mekong area, referred to as Indigenous ethnic minorities (IEM). Many contradictions arise that result in security risks, and the impact of sharing IEM data could be both positive and negative in unanticipated ways. A balance of freedoms is required: transparency versus personal security. (Chung & Chung, 2019, p. 12)

Within this contradiction lies a major difficulty: how to have accessible and transparent data while also ensuring the right to privacy for the subjects of that data. This points to the deeper issue that data does not automatically promote change or address issues of marginalization, colonialism, or discrimination, let alone combat the imbalances of power in negotiations and consultations led by governments. (Chung & Chung, 2019, p. 20)

Open Data initiatives raise apprehensions within ID-SOV networks because they often lack safeguards for Indigenous Peoples. There is a growing emphasis on expanded data sharing, exemplified by the widely embraced FAIR principles (Findable, Accessible, Interoperable, Reusable). Nevertheless, this trend has generated tensions when it comes to safeguarding, sharing, and utilizing data pertaining to Indigenous Peoples. To promote meaningful engagement between data collectors and users with Indigenous perspectives, the CARE Principles provide a valuable framework for deliberating upon responsible data utilization.  (Kukutai, 2020)

CARE Principles for Indigenous Data Governance 

While the FAIR principles primarily focus on data itself and overlook the ethical and socially responsible aspects of data usage, such as power imbalances and historical contexts related to data acquisition and utilization, the CARE principles prioritize the welfare of Indigenous Peoples and their data. They can be integrated alongside the FAIR Principles across the entire data lifecycle to ensure mutual advantages and address these broader ethical considerations. (RDA COVID-19 Indigenous Data WG, 2020, p. 57)

The CARE Principles are:

  • Collective Benefit: Data ecosystems shall be designed and function in ways that enable Indigenous Peoples to derive benefit from the data.

  • Authority to Control: Indigenous Peoples’ rights and interests in Indigenous data must be recognised, and their authority to control such data must be empowered. Indigenous data governance enables Indigenous Peoples and governing bodies to determine how Indigenous Peoples, as well as Indigenous lands, territories, resources, knowledges, and geographical indicators, are represented and identified within data.

  • Responsibility: Those working with Indigenous data have a responsibility to share how those data are used to support Indigenous Peoples’ self-determination and collective benefit. Accountability requires meaningful and openly available evidence of these efforts and the benefits accruing to Indigenous Peoples.

  • Ethics: Indigenous Peoples’ rights and wellbeing should be the primary concern at all stages of the data life cycle and across the data ecosystem.

(Carroll et al., 2020)


If these principles can be integrated into systems of open data, it could be a genuine turn towards decolonizing data; however, they need to be more than just principles. If we center the CARE Principles and Indigenous Data Sovereignty in data governance on a global scale, perhaps we can steer away from harmful colonial data mining and towards a more balanced relationship with data.





Resources

Carroll, S. R., Garba, I., Figueroa-Rodríguez, O. L., Holbrook, J., Lovett, R., Materechera, S., Parsons, M., Raseroka, K., Rodriguez-Lonebear, D., Rowe, R., Sara, R., Walker, J. D., Anderson, J., & Hudson, M. (2020). The CARE Principles for Indigenous Data Governance. Data Science Journal, 19. https://doi.org/10.5334/dsj-2020-043

Chung, P., & Chung, M. (2019). Indigenous data sovereignty in the Mekong region. 2019 World Bank Conference on Land and Poverty.

Couldry, N., & Mejias, U. A. (2021). The decolonial turn in data and technology research: What is at stake and where is it heading? Information, Communication & Society. https://doi.org/10.1080/1369118X.2021.1986102

Kukutai, T., Carroll, S. R., & Walter, M. (2020). Indigenous data sovereignty. Retrieved March 5, 2022, from https://eprints.utas.edu.au/34971/2/140589-Indigenous%20data%20sovereignty.pdf

RDA COVID-19 Indigenous Data WG. "Data sharing respecting Indigenous data sovereignty." In RDA COVID-19 Working Group (2020). Recommendations and guidelines on data sharing. Research Data Alliance. https://doi.org/10.15497/rda00052

Taylor, J., & Kukutai, T. (2016). Indigenous data sovereignty: Toward an agenda. Australian National University Press.

Walter, M., Kukutai, T., Russo Carroll, S., & Rodriguez-Lonebear, D. (2021). Indigenous Data Sovereignty and Policy. Routledge.


K-Means Tutorial in Python


Hi everyone,

Welcome back to my series of Machine Learning Algorithms Tutorials. This time we’ll be looking at K-Means, one of the most popular and powerful clustering algorithms in Machine Learning. In this article, I will explain what it is, how it works, and why it’s useful for finding patterns in data. And of course, we’ll have our tutorial in Python as well!

I hope you enjoy reading this article as much as I enjoyed researching and writing it. With this series, I developed a fascination for Machine Learning and how it can help us solve complex problems and make better decisions. If you share this passion, then you are in the right place. Let’s dive in!

K-Means is one of the most popular and simplest clustering algorithms. It is an unsupervised learning technique that aims to partition a set of data points into a number of groups (called clusters) based on their similarity.

The basic idea of K-Means is to assign each data point to the cluster whose center (called centroid) is closest to it. The centroid of a cluster is the average of all the data points in that cluster. The algorithm iterates until the centroids stop changing or a maximum number of iterations is reached.

We need to know what the algorithm needs to work. In this case, K-Means requires the following inputs:

  1. The number of clusters (k), where we specify the number of clusters we want the algorithm to group the data into.

  2. The data that needs to be clustered. Each data point should have a set of features or attributes that describe it.

  3. Initial centroids for each cluster. These centroids can be randomly selected from the data points or manually specified.

Determining the optimal number of clusters (k) is an important step in using the k-means algorithm effectively. There are several methods that can be used to estimate the optimal value of k, including:

  • Elbow method, which involves plotting the sum of squared distances between each data point and its assigned centroid for different values of k. The value of k at which the rate of decrease in the sum of squared distances slows down, forming an elbow-like shape, is considered the optimal number of clusters.

  • Silhouette method, which involves calculating the silhouette score for different values of k. The silhouette score measures how similar a data point is to its assigned cluster compared to other clusters. The value of k that maximizes the average silhouette score is considered the optimal number of clusters.

  • Gap statistic method, which involves comparing the within-cluster variation for different values of k to a null reference distribution. The value of k that maximizes the gap statistic is considered the optimal number of clusters.

It’s important to note that these methods are not foolproof and may not always give a clear indication of the optimal number of clusters. Therefore, it’s often useful to try multiple methods and compare the results to choose the best value of k.
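As a small illustration of the first of these methods, here is a minimal sketch of the elbow method using scikit-learn on synthetic data; the dataset, range of k, and parameter values are illustrative choices, not requirements.

```python
# Minimal sketch of the elbow method on synthetic data.
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=4, random_state=42)

inertias = []
k_values = range(1, 11)
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias.append(model.inertia_)  # sum of squared distances to the closest centroid

plt.plot(list(k_values), inertias, marker="o")
plt.xlabel("Number of clusters (k)")
plt.ylabel("Sum of squared distances (inertia)")
plt.title("Elbow method")
plt.show()
```

Because the data above was generated with four blobs, the curve typically bends around k = 4; on real data the elbow is often less obvious, which is why combining methods is advisable.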

The next thing to do is to initialize the centroids, there are different ways to initialize the k centroids in this algorithm, including:

  • Random initialization: k centroids are randomly selected from the data points. This is a simple and commonly used method, but it may result in suboptimal clustering if the initial centroids are not representative of the data distribution.

  • K-means++ Initialization: aims to select k centroids that are far apart from each other and representative of the data distribution. It involves selecting the first centroid randomly from the data points and then selecting subsequent centroids based on the distance from the previously selected centroids. This method typically results in better clustering performance than random initialization.

  • Manual Initialization: in some cases, the user may have prior knowledge about the data and the expected clusters, and can manually specify the initial centroids.

Notice that the choice of initialization method can affect the clustering result, so it’s often recommended to run the algorithm multiple times with different initializations and choose the best result.

Once we have our initialization method defined, we can start with our iterative process which consists of calculating the distance between the points and each centroid, assigning the points to each cluster, and updating the centroid positions.

For each data point in the dataset, the algorithm calculates the Euclidean distance between the point and each centroid. The Euclidean distance is simply the straight-line distance between two points in a Euclidean space, such as a two-dimensional plane. This metric is widely used because it is simple to compute and is an intuitive distance measure that can be easily understood and visualized.

Moreover, the Euclidean distance is suitable for continuous data and mathematical models. However, there are cases where the Euclidean distance may not be appropriate, such as text clustering problems. In that situation, the cosine distance metric, which measures the angle between two vectors, is commonly used.

The choice of distance metric depends on the nature of the data and the problem at hand. It is always good practice to explore different metrics.
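As a tiny illustration of the difference, the sketch below compares the two metrics for a pair of made-up vectors that point in the same direction but have different magnitudes (it assumes SciPy is available, which ships alongside scikit-learn in most setups).

```python
# Euclidean vs. cosine distance for the same pair of vectors.
import numpy as np
from scipy.spatial.distance import cosine, euclidean

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])   # same direction as a, twice the magnitude

print(euclidean(a, b))  # > 0: the points are far apart in space
print(cosine(a, b))     # ~ 0: the vectors point the same way
```

This is why cosine distance is popular for text vectors, where direction (which words appear) usually matters more than magnitude (document length).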

Once the distance is calculated, the algorithm assigns each data point to the cluster with the closest centroid.

After this step, the algorithm recalculates the centroid positions, each of which represents the mean of all data points assigned to that cluster. The next step is to repeat this iterative process until convergence is reached, which happens when the assignment of data points to clusters no longer changes or when the change is below a predefined threshold.

The final output of the algorithm is a set of k clusters, each represented by its centroid, and a label for each data point indicating its assigned cluster. The quality of the clustering result can then be evaluated using metrics such as the within-cluster sum of squares or the silhouette score.
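To make the iterative process concrete, here is a minimal from-scratch sketch in NumPy of the loop just described (random initialization, Euclidean distances, mean updates). The function and variable names are my own, and this is illustrative rather than production code; in practice you would use scikit-learn’s KMeans, as in the tutorial below.

```python
# From-scratch sketch of the K-Means loop, for illustration only.
import numpy as np

def kmeans(X, k, max_iters=100, tol=1e-4, seed=42):
    rng = np.random.default_rng(seed)
    # Random initialization: pick k distinct points as starting centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]

    for _ in range(max_iters):
        # Euclidean distance from every point to every centroid.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)  # assign each point to the closest centroid

        # Update each centroid to the mean of its assigned points.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])

        # Stop when the centroids barely move (convergence).
        if np.linalg.norm(new_centroids - centroids) < tol:
            centroids = new_centroids
            break
        centroids = new_centroids

    return labels, centroids

# Tiny usage example on two synthetic 2-D blobs.
X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
labels, centroids = kmeans(X, k=2)
print(centroids)
```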

After providing an overview of the k-means algorithm, it’s important to discuss its strengths and limitations, understanding these is important for making informed decisions about its use in different applications.

Among its benefits we can include the following:

  • It’s computationally efficient and suitable for large datasets, because the algorithm only requires a few simple computations per iteration, making it a suitable choice for clustering tasks where efficiency is an important consideration.

  • It is easy to understand and implement, since it does not require advanced mathematical or statistical knowledge, making it accessible to practitioners with varying levels of expertise in data science and machine learning.

  • It can handle data with a large number of dimensions. K-means is able to find patterns and structure in high-dimensional data, making it a valuable tool in many applications.

However, K-Means is not without its limitations, including:

  • The algorithm relies on the initial selection of centroids, which can affect the final clustering results. As we previously discussed, running the algorithm multiple times with different initializations can help mitigate, though not eliminate, this limitation.

  • K-means assumes that the clusters are spherical, which can lead to incorrect cluster assignments when clusters are non-spherical. In real-world datasets, clusters can have complex shapes and structures that do not fit the spherical assumption of k-means. In these cases, more advanced clustering algorithms such as density-based clustering or hierarchical clustering may be more appropriate.

  • The algorithm struggles with identifying clusters of varying sizes and densities. This is because the algorithm assigns data points to the closest centroid, which can result in one large cluster and several small clusters.

Overall, understanding the limitations of k-means is important for making informed decisions about when and how to apply the algorithm. Despite these limitations, K-Means remains one of the most widely used clustering algorithms because of its simplicity and efficiency, and these drawbacks have not deterred its use across many domains.

The K-Means algorithm has several applications in various disciplines, including:

  1. Market segmentation: K-means clustering is often used in marketing to segment customers based on their behavior, preferences, and demographics. By grouping customers with similar characteristics, companies can tailor their marketing strategies to each segment and improve customer satisfaction and loyalty.

  2. Image segmentation: segmentation of images based on their color or texture features. This technique is commonly used in image compression, object recognition, and image retrieval.

  3. Anomaly detection: it can be used for anomaly detection in various fields, such as finance, cybersecurity, and fraud detection. By clustering normal data points and identifying outliers that do not belong to any cluster, k-means can help detect unusual patterns that may indicate fraudulent or suspicious activity.

  4. Bioinformatics: clustering of genes, proteins, or samples based on their expression levels or sequence similarity. This technique can help identify patterns in large biological datasets and enable researchers to study the relationships between different biological entities.

  5. Social Network Analysis: K-means clustering can be used in social network analysis to cluster users based on their behavior, interests, or social connections. By identifying groups of users with similar characteristics, researchers can gain insights into the structure and dynamics of social networks and predict user behavior.

While K-Means is a very effective algorithm with plenty of applications, it may not be suitable for some situations, such as categorical data, as previously noted. Shortcomings like these have led to the development of variants and extensions of the algorithm, including K-Modes.

K-Modes is a clustering algorithm that is specifically designed for categorical data and is based on the same principles as K-Means. In this way, the algorithm represents an important extension and highlights the ongoing development of clustering techniques to meet the diverse needs of researchers and practitioners.

There are several variants and extensions of the K-Means algorithm that have been proposed. Some examples are K-Medoids, Fuzzy C-Means, and K-Prototype. The first one replaces the mean calculation with the selection of a representative data point from each cluster, known as a medoid, making it more robust to outliers and noise in the data.

Fuzzy C-Means assigns a degree of membership to each data point for every cluster. This allows for more nuanced clustering and can be useful when there is uncertainty or overlap between clusters. For example, in image segmentation, a pixel may belong to multiple regions with different colors, and this type of clustering can provide a more accurate representation of the underlying structure of the data.

Finally, the K-Prototype extension is a hybrid algorithm that combines both K-Means and K-Modes to cluster datasets with both numeric and categorical data. It assigns a weight to each feature based on its type and uses this to calculate the distance between data points.

These variants and extensions demonstrate the ongoing efforts to improve and adapt clustering algorithms to better suit the needs of different applications and types of data.

Python Tutorial

To ensure compatibility, it is recommended to use an Anaconda distribution for this tutorial. However, if you don’t have Anaconda installed and want to use your own kernel, you can manually install the required packages using pip. You can execute the code block below by uncommenting from the “import sys” line onwards to automatically install the necessary packages.
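Since the original notebook cell is not reproduced in this post, here is a minimal sketch of what such an install block might look like; the package names are simply the ones used later in the tutorial, and the exact original code may differ.

```python
# Optional: uncomment from the "import sys" line onwards to install the
# packages used in this tutorial into the current environment via pip.
# import sys
# import subprocess
# subprocess.check_call([sys.executable, "-m", "pip", "install",
#                        "numpy", "pandas", "matplotlib", "seaborn", "scikit-learn"])
```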

To perform this tutorial, we need to import several essential libraries in Python.

We import NumPy, a powerful library for numerical operations on arrays, which provides efficient mathematical functions and tools. Next, we import pandas, a widely used data manipulation and analysis library that allows us to work with structured data in a tabular format.

To visualize the results of our clustering analysis, we import matplotlib.pyplot, a plotting library that enables us to create various types of charts and graphs. This will help us understand the patterns and relationships within the data.

For the actual clustering process, we import scikit-learn’s KMeans module. To ensure accurate results, we also import the StandardScaler module from scikit-learn’s preprocessing submodule. It is used for feature scaling, which helps to normalize the data and improve the performance of the clustering algorithm.

Lastly, we import silhouette_score from scikit-learn’s metrics module. The silhouette score is a metric used to evaluate the quality of clustering results. It measures how well each data point fits within its assigned cluster.
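The original code cells are not included in this post, so the blocks that follow are minimal reconstructions of the steps described. The imports described above look like this (seaborn, used later for plotting, is imported in its own step):

```python
import numpy as np                                # numerical operations on arrays
import pandas as pd                               # tabular data manipulation
import matplotlib.pyplot as plt                   # plotting
from sklearn.cluster import KMeans                # the clustering algorithm
from sklearn.preprocessing import StandardScaler  # feature scaling
from sklearn.metrics import silhouette_score      # clustering quality metric
```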

Next, we use the housing data from a CSV file called “housing.csv”, which you can find on the following Kaggle page: California Housing Prices | Kaggle. We specify the columns of interest (longitude, latitude, and median_house_value).

Then we remove any rows that have missing values (NaN). This ensures that we are working with a clean and complete dataset for further analysis.
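A minimal sketch of this step, assuming housing.csv has been downloaded into the working directory:

```python
# Load the dataset, keep only the columns of interest, and drop missing values.
data = pd.read_csv("housing.csv")
data = data[["longitude", "latitude", "median_house_value"]].dropna()
print(data.shape)  # rows remaining after cleaning
```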

Now we import the seaborn library to create a scatter plot. We pass the axis values and the hue to the plotting function, which helps us understand the relationship between the longitude and latitude coordinates of the housing data and the median house value.
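A sketch of the plotting step, using the column names above and the objects from the previous cells:

```python
import seaborn as sns

# Color each location by its median house value.
sns.scatterplot(data=data, x="longitude", y="latitude", hue="median_house_value")
plt.title("Median house value by location")
plt.show()
```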

We create an instance of the StandardScaler class from scikit-learn’s preprocessing module. This will help us to normalize the data and bring it to a standard scale.

When we call fit_transform, the method calculates the mean and standard deviation of each feature in the dataset and applies the scaling transformation accordingly. We save the result in the data_scaled variable.

By scaling the features, we ensure that they have a similar range and variance, which can be beneficial for certain machine learning algorithms and data analysis techniques.
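A sketch of the scaling step:

```python
# Standardize all three features to zero mean and unit variance.
scaler = StandardScaler()
data_scaled = scaler.fit_transform(data)  # `data` is the cleaned DataFrame from above
```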

In this step, we initialize an empty list called “silhouette_scores” to store the scores. Then, we iterate through the range of k values from 2 to 10. For each value of k, we create an instance of the KMeans class with k clusters and fit the scaled data to the model.

Next, we calculate the silhouette score for the clustered data using the silhouette_score function, which measures the quality of the clustering results. The resulting score is appended to the silhouette_scores list.

Finally, we plot the silhouette scores against the values of k, where the x-axis represents the number of clusters (k), and the y-axis represents the silhouette coefficient. The plot is labeled with appropriate axes labels and a title, and displayed.
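A sketch of the silhouette loop described above; note that computing the silhouette score on the full dataset compares every pair of points and can take a while.

```python
# Evaluate k = 2..10 with the silhouette score.
silhouette_scores = []
k_values = range(2, 11)

for k in k_values:
    kmeans = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=42)
    labels = kmeans.fit_predict(data_scaled)
    silhouette_scores.append(silhouette_score(data_scaled, labels))

plt.plot(list(k_values), silhouette_scores, marker="o")
plt.xlabel("Number of clusters (k)")
plt.ylabel("Silhouette coefficient")
plt.title("Silhouette analysis for K-Means")
plt.show()
```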

After plotting this graph of the silhouette coefficient for different values of k, we can analyze the results to determine the optimal number of clusters for our data. We need to identify the value of k that corresponds to the peak, or highest, silhouette coefficient on the graph. This is the number of clusters that yields the most distinct and well-separated groups within the data. In this case, k = 2.

We set the number of clusters (k) to 2, partitioning the data into two distinct groups. We then create an instance of the KMeans class with the specified number of clusters. We use k-means++ as our init method, which is widely used and helps improve the convergence of the algorithm. Additionally, we set a random state of 42 to ensure reproducibility of the results.

After that, we fit the scaled data to the KMeans model using the fit() method. This process calculates the cluster centroids and assigns each data point to its corresponding cluster based on the proximity of the centroids.

Finally, we obtain the cluster labels and centroids and set the scatter plot labels to display our graph.
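A sketch of this final fitting and plotting step, showing only the two geographic features of the scaled data:

```python
# Fit the final model with the chosen number of clusters.
k = 2
kmeans = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=42)
labels = kmeans.fit_predict(data_scaled)
centroids = kmeans.cluster_centers_

# Scaled longitude/latitude colored by cluster, centroids as red X's.
plt.scatter(data_scaled[:, 0], data_scaled[:, 1], c=labels, cmap="viridis", s=10)
plt.scatter(centroids[:, 0], centroids[:, 1], c="red", marker="X", s=200)
plt.xlabel("Longitude (scaled)")
plt.ylabel("Latitude (scaled)")
plt.title("K-Means clusters (k = 2)")
plt.show()
```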

The cluster centroids are marked as red X’s. For further exploration of this algorithm, you can visit our GitHub Repository, where you can access the complete code for convenient execution and customization.

In conclusion

K-Means is a popular clustering algorithm in machine learning that aims to partition data points into clusters based on their similarity. It is an unsupervised learning technique that can find patterns in data.

The algorithm works by iteratively assigning data points to the cluster with the closest centroid and updating the centroids based on the assigned points. It continues this process until convergence.

K-Means requires specifying the number of clusters (k), providing the data to be clustered, and initializing the centroids. Determining the optimal value of k can be done using methods like the elbow method, silhouette method, or gap statistic method.

This technique has strengths such as computational efficiency, ease of implementation, and the ability to handle high-dimensional data. However, it has limitations such as sensitivity to initial centroid selection and the assumption of spherical clusters.

Evaluation of the clustering result can be done using metrics like the within-cluster sum of squares or silhouette score. K-Means finds applications in market segmentation, image segmentation, anomaly detection, bioinformatics, and social network analysis.

Are you eager to delve deeper into the fascinating world of machine learning and explore more powerful algorithms like K-Means? If so, I invite you to continue your learning journey and unlock the potential of this exciting field. You can explore our other tutorials and resources, which provide in-depth explanations and practical examples to guide you step-by-step through implementing machine learning algorithms.

Additionally, consider joining our community, where you can engage with like-minded individuals, exchange insights, and collaborate on different kinds of projects. The collective wisdom and support can enhance your learning experience and open doors to exciting opportunities. The possibilities are waiting for you, so start your journey now!

References

Babitz, K. (2023). Introduction to k-Means Clustering with scikit-learn in Python. https://www.datacamp.com/tutorial/k-means-clustering-python

K-Means Clustering Algorithm - JavatPoint. (n.d.). www.javatpoint.com. https://www.javatpoint.com/k-means-clustering-algorithm-in-machine-learning

Li, Y., & Wu, H. (2012). A clustering method based on K-Means algorithm. Physics Procedia, 25, 1104–1109. https://doi.org/10.1016/j.phpro.2012.03.206

Mannor, S., Jin, X., Han, J., & Zhang, X. (2011). K-Means clustering. In Springer eBooks (pp. 563–564). https://doi.org/10.1007/978-0-387-30164-8_425

Kanungo, T., Mount, D. M., Netanyahu, N. S., Piatko, C. D., Silverman, R., & Wu, A. Y. (2002). An efficient k-means clustering algorithm: Analysis and implementation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7), 881–892. https://doi.org/10.1109/TPAMI.2002.1017616


Decolonizing AI Ethics: Indigenous AI Reflections

TLDR

This article opens with a quote about bringing an end to colonization and explores how we might imagine AI helping to bring about that end and start new beginnings for a sustainable future. We establish the importance of citing Indigenous thinkers in all spaces, in an effort to decolonize without repeatedly colonizing, something that is crucial to my own research. Then we discuss the challenges we are up against in the Western techno-utilitarian landscape and the importance of a mindset shift. Following that, some examples of how AI could potentially help democracy are introduced in order to show its true potential.

This is a call to action for readers to get involved in decolonizing ethics in AI. The action starts in our own thoughts and behaviors, and in how we interact with and use AI. If we can consider the possibilities of AI weaving into the goals of inclusivity and sustainability, celebrating diversity, and promoting democracy, we can bring it into reality. It does take work. If nothing is done, things will continue as they are.

Imagining Decolonization with AI

“¿Por qué podemos imaginar el fin del mundo pero no el final del colonialismo?” (Indigenous Action Media, 2022)

English translation: “Why can we imagine the end of the world but not the end of colonization?”

This quote from a little book on ancestral futurism from Indigenous Action Media stood out to me, but I want to take it even further as it is relevant to the current landscape of decolonizing ethics in AI:

Why can we imagine AI bringing on the end of the world, and not imagine AI bringing on the end of colonization?

As a writer, I have been experimenting with AI chatbots, specifically Microsoft’s Bing, and I thought it would be interesting to engage with it on this topic. Here is what Bing came up with after I asked it: “Can you help me with a blog post around this theme: Why can we imagine AI bringing on the end of the world, and not imagine AI bringing on the end of colonization?”

“The reason why we can imagine AI bringing on the end of the world and not imagine AI bringing on the end of colonization is because of the way we think about AI. We tend to think of AI as a tool that can be used for good or bad purposes. When we think about AI bringing on the end of the world, we are thinking about how it could be used for bad purposes. For example, if an AI system were to become self-aware and decide that humans are a threat to its existence, it could take actions that would lead to the end of the world.

On the other hand, when we think about AI bringing on the end of colonization, we are thinking about how it could be used for good purposes. For example, if an AI system were to be used to help colonized people gain their independence, it could help bring an end to colonization.

It’s important to remember that AI is just a tool and it’s up to us to decide how we want to use it. If we use it for good purposes, it can help us solve some of the world’s biggest problems. If we use it for bad purposes {or without care} it could lead to disastrous consequences.”

Here are my thoughts: First of all, I think negligence and speed of development are just as dangerous as bad actors. (I added in the {or without care} part.) There is a lack of accountability and responsibility in the air around AI, and this needs to be addressed. Secondly, I noticed that Bing located the problem in the way we think about AI, namely as a tool, which connects to Jason Edward Lewis’s piece on making kin with machines. It is vital to avoid enacting a master/slave relationship with AI. Lewis, a professor whose work centers on the Indigenous future imaginary, argues for an alternative kinship with AI, and a relationality as opposed to a rationality. (Lewis, 2018)

The concept of kinship with AI is hard for someone like me to understand, as a mixed white person from Northeastern USA, coming from generations upon generations of colonization. But I think this is what we need to strive to understand. Because creating a master/slave relationship with humans being the masters and the machines being the slaves should not be the goal.

This is a call to action: to change the way we relate to AI. It is a mindset shift at its core. With no intervention, there will be more of the same. Check out my last blog The Precarious Human Work Behind AI for more on the dark side of the rapid growth of AI and the hidden ways it is affecting workers.

As we continue to develop AI and data science technologies, it’s important that we consider the ethical implications of these systems. While many people are quick to imagine AI bringing on the end of the world, few consider that these systems could bring about the end of colonization. In this blog post, we explore some of the key issues surrounding AI and data science ethics and discuss how we can work towards a more equitable future.

The Importance of Citing Indigenous Thinkers

In the ongoing process of decolonization, Indigenous peoples and their expansive knowledge systems (aka data) are at the core. Indigenous people have always been at the center of the fight against colonization. They have always been imagining futures in a sustainable manner, aware of the balanced systems of interacting with each other and the earth. We need to be considering how to work with both new technologies and traditional knowledge. Indigenous wisdom needs to be highlighted, as well as the acknowledgment of the centuries-long battle against oppression that Indigenous peoples have faced.

AI runs on data, and we are in the age of information. The future doesn’t have to keep repeating colonialism.

Over the past year, I have been a part of a research group at the Berkman Klein Center for Internet and Society at Harvard, and for our last meeting, we met with another group, Indigenous in AI. Some of the readings done as a group I have included in this post. In our joint meeting, we got to hear directly from Jason Edward Lewis and other representatives from Indigenous AI, which made for an enlightening and rich conversation.

A strong point that I want to carry through is the importance of centering on and citing Indigenous knowledge in respectful ways. One of the readings I have done recently was by Zoe Todd, a Métis anthropologist. I will repeat her sentiment and say “I cannot, nor would I try, to speak for Indigenous thinkers.” (2016, p.14)

In Todd’s article, she gives an example of sitting in a European lecture hall and hearing Indigenous thinkers left out and not cited for their original ideas; worse, those ideas are instead filtered through white intermediaries. This is incredibly difficult to track and prove, as it takes looking for what is not there. There isn’t any data on the number of Indigenous researchers or tech workers who work in AI, but the data on diversity in these and other STEM fields shows that we still have a long way to go toward diversity and inclusion.

Todd stressed the importance of citing and quoting Indigenous thinkers directly and generously, and I would say that this needs to be done everywhere, in tech and business as well as academia. The author argues that Indigenous thinkers should be cited as thinkers in their own right, not just as representatives of an amorphous Indigeneity that serves European intellectual or political purposes. They should also be cited as dynamic philosophers and intellectuals, not just as research subjects or collaborators. By doing so, we can help to decolonize intellectual spaces and create a more inclusive and equitable academic environment. (Todd, 2016, p. 7) Her piece An Indigenous Feminist’s Take On The Ontological Turn: ‘Ontology’ Is Just Another Word For Colonialism is an enlightening and refreshing read which I would recommend.

Working in this intellectual space while being non-Indigenous is tricky, but it is what has always felt the most authentic to me. My goals as a researcher include supporting Indigenous thinkers and knowledge and crediting them as much as I can. I don’t want to take credit for Indigenous thought, and I will ensure that doesn’t happen. I hope that people who read my work and want to cite it will cite the Indigenous creators whom I reference. I think it is important to get the ideas out there, and the trend toward creating sustainability across sectors is a good one overall. Sustainability is not a Western idea, but something we need to fight a lot of Western ideals to achieve. It requires a change of mindset, new ways of thinking and doing things, and a willingness to change. The fastest-changing things around are in technology, especially AI. If we want to really change things and go in a more sustainable direction, we must take action, and this is where I focus my research.

The Importance of Understanding AI Systems Beyond a Western Techno-Utilitarian Lens

In the fabulous ‘Indigenous Protocol and Artificial Intelligence Position Paper’, Lewis explains the importance of understanding AI systems beyond a Western techno-utilitarian lens. Lewis argues that if we only think about these systems through a Western lens, we will not fully grasp what they are and could be. “At best, we risk burdening them with the prejudices and biases that we ourselves still retain. At worst, we risk creating relationships with them that are akin to that of a master and slave.” (Lewis, 2020, p. 6) As I stated above, I am convinced that we must avoid falling into master/slave style relations with AI, especially as AI gets more human-like. If we want to create an inclusive, equitable, and sustainable world, we need to put this dynamic fully in the past.

When I asked Bing’s AI to summarize Lewis’s work, it came out with something that was very black and white as far as getting AI to be used for good versus bad purposes. Yes, looking out for bad actors is important. But the issues currently and continually tend to be more dangerously subtle, like reproducing biases that put minorities at unfair disadvantages.

It isn’t simply controlling the systems to be used for good purposes and not bad; rather, it is changing the mindset around the relationships between humans and machines. Think about your relationship with your car, moto, or bicycle. This is a machine that you own and it is a tool to get you from place to place. Is it your slave? Do you treat it that way? Don’t you take care of it, change the tires, check the brakes? Doesn’t it take care of you? Don’t you go through good times and sometimes hard times together? Don’t you respect your vehicle, and understand its utility and also its danger, since it is dangerous out on the road? Sometimes we name them, personify them, and anthropomorphize them. Think of this relationship, and then think of it with something that is made to be human-like, something you can talk to in a natural way; but is still a tool, a machine. We need to be seriously thinking about how we think about AI.

Lewis and his coauthors pointed out that the assumptions we are currently making are getting “baked into the core material of our machines, fundamentally shaping the future for decades hence.” (Lewis et al., 2018) We could take this as a warning, or as an opportunity. But we cannot be lazy and expect technology to change things on its own. People need to step in and change the ingredients to be baked in, to create a more inclusive and sustainable world.

The Potential Benefits of AI

What would the world look like if AI followed value systems which truly promoted the betterment of all, instead of racing for profits and for the sake of speed itself? The potential is amazing, even for helping democracy.

There are many possibilities of AI being used for public benefit, and to help democracy, not hurt it. If used correctly, it could plausibly help democracy by:

-Educating citizens

-Assisting with group deliberations

-Summarizing what people think

-Finding possible solutions (Schneier et al., 2023)

Large language models could potentially aid politicians in better understanding their citizens’ needs. (Schneier et al., 2023) We know that data science and AI can greatly influence efficiency and effectiveness, not to mention the speed of getting things done. However, as many have pointed out, “that same speed and efficiency can also magnify social inequities and public harms.” (Bruce, 2023)

These are just some examples of what the future of AI could look like if it were actually geared toward public benefit. What would it look like for AI to align more with Indigenous values? This is the question that needs to be asked as we move forward.

References

Bruce, A. B. (2023, May 26). How Public Interest Tech Principles Can Shape the Future of Data Science and Artificial Intelligence — PITcases. PITcases. https://pitcases.org/portfolio/pit-future-of-data-science-ai-afua-bruce/

Lewis, J. E., Arista, N., Pechawis, A., & Kite, S. (2018). Making Kin with the Machines. Journal of Design and Science. https://doi.org/10.21428/bfafd97b

Lewis, Jason Edward, ed. 2020. Indigenous Protocol and Artificial Intelligence Position Paper. Honolulu, Hawaiʻi: The Initiative for Indigenous Futures and the Canadian Institute for Advanced Research (CIFAR).

Schneier, B., Farrell, H., & Sanders, N. E. (2023, April 21). The Tech Investment We Should Make Now to Avoid A.I. Disaster. Slate Magazine. https://slate.com/technology/2023/04/ai-public-option.html

Todd, Z. (2016). An Indigenous Feminist’s Take On The Ontological Turn: ‘Ontology’ Is Just Another Word For Colonialism. Journal of Historical Sociology, 29(1), 4–22. https://doi.org/10.1111/johs.12124

The Precarious Human Work Behind AI

AI is now everywhere, but it doesn’t exist as autonomously as it seems. AI is increasingly prevalent in a wide variety of industries, many of which hide the countless workers behind the curtain who make it function, and I am not just talking about the engineers who create it.

It is important to acknowledge the human work behind AI development and maintenance, from grueling content moderation to rideshare drivers to all of us whose data serves to profit large corporations. This leaves countless workers in precarious positions, stuck in survival mode and forced to adapt as best as they can, with low wages and the threat of job loss looming as tasks continue to be automated.

Anything done in the name of the ‘safety and trustworthiness’ of AI is truly an afterthought to capital gain for corporations. In a podcast with engineers from OpenAI, they laughed about how ‘Trust and Safety’ (T&S) really stands for ‘Tradeoffs and Sadness.’ This is a fundamental problem for multiple reasons. In this blog, we will discuss the areas where the rapid development and deployment of AI are affecting precarious work in various ways.

The Human Work Behind Data

Data is the foundation of AI and is generated by people. Each day, approximately 328.77 million terabytes of data are created. The work done to produce data is almost never compensated, although it is massively profited off of by large corporations. How could companies compensate their users for the data that they use and profit from? What kind of laws or policies could be created to solve this problem, and how would it work on a global scale? These are still questions that we are grappling with as a society. 

Data is the fuel of AI. There is a stark lack of control and ownership over data, which brings up some serious ethical considerations which include but are not limited to privacy, and which are barely covered by inconsistent and often unenforced data protection laws.  

What should be done about this aspect of the human work behind AI? This could be seen as a form of ghost work. Should it be compensated? How would that be implemented? Some companies are taking initiatives here and paying users very small amounts for their data, but the issue is much bigger than that. The data collected is used to target advertising at users, which means further exploitation. Not to mention that it can be used to feed AI that replaces human work, so the very data you aren’t paid for could be used to put you out of a job, while also being used to sell you things.

In 2017, it was estimated that the transaction of giving up personal details to companies like Facebook came to about $1,000 per person per year, and this figure is quickly rising. (Madsbjerg, 2017) The exact value of our data is unclear, even to Google, but the data is often used for targeted advertising and sold to data brokers, who sell it as a commodity to advertisers, retailers, marketers, government agencies, and other data brokerages. According to a report by SecurityMadeSimple.org, the data brokerage industry generates over $200 billion of revenue yearly and continues to grow. Another report, by Maximize Market Research, states that the data broker market was valued at $257.16 billion in 2021 and that total data broker revenue is expected to grow at 4.5% from 2022 to 2029, reaching nearly $365.71 billion. When will we, as the users and providers of this data, ever see any of these profits?

One proposed answer would be a universal basic income based on the data we produce. This idea is not new; it was first presented by Jaron Lanier in his 2013 book, Who Owns the Future? The book criticizes the accumulation and valuation of consumer data in the tech industry, which fails to acknowledge any monetary debt to the people for all the free information they create and give away.

The Exploitation of Workers in AI Moderation and Content Labeling

Now, we will leave that can of worms crawling around and discuss the low-paid gig work that goes into moderating AI systems, such as scanning content for violence and hate speech or endlessly labeling data. These jobs are often outsourced to workers in the Global South who are repeatedly exposed to traumatic content and receive little compensation. This is highly exploitative work, with little room for workers to organize and demand workers’ rights.

Take for example the story of Sama, which claims to be an “ethical AI” outsourcing company. Sama is headquartered in California and handles content moderation for Facebook. Its Kenya office pays its foreign employees a monthly pre-tax salary of around $528, which includes a monthly bonus for relocating from elsewhere in Africa. After tax, this amounts to around $440 per month. Based on a 45-hour work week, this equates to a take-home wage of roughly $2.20 per hour. Sama employees from within Kenya who are not paid the monthly relocation bonus receive a take-home wage equivalent to around $1.46 per hour after tax. (Perrigo, 2022) 

Time published a report on Sama which detailed a failed worker uprising. The workers faced the trauma of viewing hundreds of horrific pieces of content every day, having to decide within 50 seconds whether each was appropriate for Facebook, while living hand-to-mouth on low salaries without the support needed for this PTSD-inducing job. When workers organized in protest and planned a strike, high-paid executives flew in from San Francisco to ‘deal’ with the situation. They isolated the spearheader of the workers’ alliance and terminated him, making him look like a bully who forced 100 other workers to sign a petition against the company. (Perrigo, 2022) The real bullies got away with this, as the ultimate goal is to keep Facebook happy. It suits them to have low-waged workers with no other options suffering life-long trauma all day, every day. But these workers need fair pay and workers’ rights. They need real support for their labor, which is what makes Facebook a safer space, with less hate speech and violent content. They deserve to have a voice.

Another example is Mechanical Turk, or MTurk, a marketplace for human intelligence micro-tasks such as tedious image labeling. The work is extremely low-paid (with no guarantee of pay at all), offers poor labor protections, and is highly exploitative. As of December 2019, MTurk’s worker portal had 536,832 visitors, and although the work is demoralizing and pays pennies, many depend on it over no work at all. (Mehrotra, 2020) MTurk has been operating since 2005, still with no worker protections.

Case Studies on the Human Intervention Required for AI Systems: Spotlight on the Global South

Taking a deeper peek behind the curtain, we see that AI systems often require unseen human intervention and workarounds to operate effectively. This goes beyond the desks of technologists, and drives through the streets of nearly every city. 

One study looked into the operations of two startups, Gojek and Grab, which entered Jakarta in 2015 with the aim of digitizing the city’s motorbike taxi market. (Qadri & D’Ignazio, 2022) They found that the platforms’ view of the city is idealized and flattened, with no consideration for frictions such as traffic, parking delays, or blocked roads. The routes assigned to drivers are often inappropriate or dangerous because the platform ignores these variables, so local drivers develop workarounds that remain invisible and unacknowledged by the platforms. The drivers know the safest ways through their own city, despite what the app says.

The authors compared this to Donna Haraway’s “god trick” (1988) because it places the viewer in the impossible position of a disembodied, all-knowing eye looking down at the city. (Qadri & D’Ignazio, 2022) The startups’ discourse often casts technology as the central organizer and optimizer of activity, while other forms of (human) intelligence are considered inferior. To further demonstrate the dehumanization at play, Grab’s blog refers to drivers as “supply” units that can be moved around like goods or trucks. (Garg et al., 2019) In reality, it is the human drivers’ knowledge of the city in its ever-changing state that makes the taxi service work, but the “AI” technology gets all the credit and the company owners reap most of the profit.

Workers’ rights remain an issue across many new areas of precarious occupation behind AI. As stated in a paper on work regulations for platform food delivery workers in Colombia, a neoliberal discourse on entrepreneurship is deepening the crisis of platform workers, who are characterized as “self-employed” and therefore excluded from the employment rights guaranteed to “employed workers” in local labor legislation. (Wood et al., 2019; Vargas et al., 2022, p. 38)

What is desperately needed is for people to care about people. AI has no capacity to actually care about people, even if it were based on human systems that did. Algorithms are programmed with the ultimate goal of promoting business. This leads to human workers being treated more and more like machines. With humans working under the control of algorithms, digital workers are excluded from the benefits of the value chain in which they are one of the most important subjects. (Vargas et al., 2022, p. 34)

Discussion

In a Harvard Business Review article on the humans behind the curtain of AI, the authors spoke of the paradox of automation’s last mile, the ever-moving frontier of AI’s development. (Gray & Suri, 2017) This is all the more relevant today. As AI makes progress, it creates and destroys temporary labor markets for new kinds of human-in-the-loop tasks at a rapid pace.

Contract workers are needed to train algorithms to make important decisions about content. They are also responsible for making snap decisions about what stays on a site and what’s deleted. This is a new form of employment that should be valued. (Gray & Suri, 2017) However, this work is not only still largely invisible, but the workers are not valued and the work is unreliable, low-paid, and often traumatizing. 

Adrienne Williams, Milagros Miceli and Timnit Gebru wrote an essay late last year which argued that the idea of a world where AI is the primary source of labor is still far from being realized. The push towards this goal has created a group of people performing what is called “ghost work”, a term introduced by anthropologist Mary L. Gray and computational social scientist Siddharth Suri. This refers to the human labor that is often overlooked and undervalued but is actually driving AI. Companies that have branded themselves as “AI first” rely heavily on gig workers such as data labelers, delivery drivers and content moderators who are underpaid and often subject to heavy surveillance. (Williams, Miceli and Gebru, 2022)

Recommendations from Williams, Miceli and Gebru:

  1. Funding for research and public initiatives which highlight labor and AI issues.

  2. Analysis of the causes and consequences of the unjust labor conditions behind harmful AI systems.

  3. Consideration for the use of precarious crowdworkers to advance careers of AI researchers and practitioners and shift power into the hands of workers.

  4. Co-create research agendas based on worker’s needs.

  5. Support for cross-geographical labor organizing efforts.

  6. Ensuring that research findings are accessible to workers rather than confined to academic publications. 

  7. Journalists, artists and scientists can foster solidarity by drawing clear connections between harmful AI products and labor exploitation. (Williams, Miceli and Gebru, 2022)

Recommendations from Gray and Suri:

  1. Require more transparency from tech companies that have been selling AI as devoid of human labor.

  2. Demand truth in advertising with regard to where humans have been brought in to benefit us.

  3. Recognize the value of human labor in the loop.

  4. Understand the training and support that informed their decision-making, especially if their work touches on the public interest. (Gray & Suri, 2017) 

Conclusion

I can’t stress enough the importance of acknowledging the human work behind AI. There is a need to ensure that those who contribute to the development of AI are fairly compensated and protected. When trust and safety are dismissed as “tradeoffs and sadness,” with no questioning of whether the ends justify the means, some fundamental changes to the approach are necessary. We might even question the end goals while we are at it.

We need to be humanized. It is arguable that AI was started back in the day to try to eventually replace human slavery. This is inherently problematic, as master/slave relations are built on exploitation, subjugation and dehumanization, which extends to the workers behind AI and not just to the AI itself. Although there are many benefits to AI replacing, changing, or accompanying work, it must be done in a way that is not exploitative and is centered on the betterment of all people and the planet, not in a speed-race for AI. 

While AI has the potential to revolutionize many industries, it is important to acknowledge the human work that goes behind its development and maintenance. From data collection to system maintenance, humans play a critical role in the AI ecosystem. It is essential that we recognize and value this work, and understand the real harms that are already happening around AI. 

It is easy to fear what AI can bring and how many jobs it will take. The reality is that most jobs will need to adapt to AI, and that AI is also creating many new jobs at various skill levels. This would be much better news if everyone could benefit from it, instead of it being a product of exploitation and techno-solutionism.



Sources 

Garg, A., Yim, L. P., & Phang, C. (2019). Understanding supply & demand in ride-hailing through the lens of data. Grab Tech. https://engineering.grab.com/understanding-supply-demand-ride-hailing-data (accessed 6 October 2021).

Gray, M. L., & Suri, S. (2017). The humans working behind the AI curtain. Harvard Business Review. https://hbr.org/2017/01/the-humans-working-behind-the-ai-curtain

Haraway, D. (1988). Situated knowledges: The science question in feminism and the privilege of partial perspective. Feminist Studies, 14(3), 575-599.

Fagen, R. (2023). GPT4: Eldritch abomination or intern? A discussion with OpenAI — Integrity Institute. Integrity Institute. https://integrityinstitute.org/podcast/trust-in-tech-e19-eldritch-open-ai-gpt

Lanier, J. (2013). Who Owns the Future? Simon and Schuster.

Mehrotra, D. (2020, January 28). Horror Stories From Inside Amazon’s Mechanical Turk. Gizmodo. https://gizmodo.com/horror-stories-from-inside-amazons-mechanical-turk-1840878041

Perrigo, B. (2022, February 17). Inside Facebook’s African Sweatshop. Time. https://time.com/6147458/facebook-africa-content-moderation-employee-treatment/

Qadri, R., & D’Ignazio, C. (2022). Seeing like a driver: How workers repair, resist, and reinforce the platform’s algorithmic visions. Big Data & Society, 9(2), 205395172211337. https://doi.org/10.1177/20539517221133780

Should tech companies pay us for our data? (2022, May 20). World Economic Forum. https://www.weforum.org/agenda/2018/12/tech-companies-should-pay-us-for-our-data/

Vargas, D. S., Castañeda, O. C., & Hernández, M. R. (2022). Technolegal Expulsions: Platform Food Delivery Workers and Work Regulations in Colombia. Journal of Labor and Society, 1–27. https://doi.org/10.1163/24714607-bja10009

Wood, A. J., Graham, M., Lehdonvirta, V., & Hjorth, I. (2019). Good gig, bad gig: Autonomy and algorithmic control in the global gig economy. Work, Employment and Society, 33(1), 56–75. https://doi.org/10.1177/0950017018785616

Williams, A., Miceli, M., & Gebru, T. (2022, December 10). The Exploited Labor Behind Artificial Intelligence. NOEMA. https://www.noemamag.com/the-exploited-labor-behind-artificial-intelligence/

Comparative Analysis: Ubuntu Collectivism versus Western Ethics in AI Development

When AI technologies affect everyone globally, wouldn’t it be nice if they were built with the collective in mind?

In my last blog, I introduced the African collectivist philosophy of Ubuntu and how it could be applied to Artificial Intelligence (AI) ethics for the benefit of all, based on the works of Mhlambi in 2020 and Gwagwa in 2022. The word ubuntu means “human-ness” or “being human” in the Zulu and Xhosa languages, spoken in South Africa and Zimbabwe respectively. Here I dig deeper into some of the key concepts of Ubuntu which either parallel or stand in opposition to Western ethics such as utilitarianism, and address the flaws of individualism and why we should move away from it.

What draws me personally to Ubuntu as an ethical theory for AI Governance 

Learning about Ubuntu was a breath of fresh air, as Western ideals such as individualism never sat well with me. I confronted individualism in my master’s thesis research, but didn’t come across Ubuntu until rather recently, in connection to my work in ethics in AI. It is exactly what I was looking for: an alternative ethical system which relates personhood to how we are all connected, holding that a person is a person through other people. It relates to mutual aid (Kropotkin, 1902) and to care in the grandest sense: caring about how everything affects everything, not just oneself. The idea that this level of care and collectivism could be applied to AI ethics blew me away, and the papers I have read on it, especially one by Sabelo Mhlambi, really drove this home.

A snippet of my story 

Nearly five years ago, I chose to leave the Western world behind and live in South East Asia, after also spending time in Western Africa. My decision was fueled by the distasteful air of individualism in the West, which promotes greed and putting others down to get ahead. No amount of personal comfort could erase the ever-present feeling of disconnection I have when in the US, Europe, or Australia. When I visit my hometown, everyone always asks me why I live so far away. It is a difficult question to answer, but I think it comes down to the isolation caused by individualism, which puts everyone in toxic competition with each other and in situations where your success means climbing over others. I look around and see the constant valuing of profit over life. The fact that AI has been born from this ideology is extremely problematic, as it has this baseline of individualism built in.

From my travels and living abroad, I have seen that the world is rich with diversity, and that diversity is a beautiful thing to be celebrated, not discriminated against. White men are not actually the majority in the world, yet everyone else is marginalized and minoritized. Women are minoritized even though we are over half of the population. The world has been running on systems that make zero sense. As we breathe life into artificial intelligence, it is overdue that we overhaul how we relate to one another and the world around us. It is time that we turn to non-Western-centric ideals and embrace the diversity of the world when deploying technologies that affect everyone globally.

The rest of this article will engage more deeply with Mhlambi’s work on utilizing Ubuntu as an ethical framework for AI governance moving forward, something I endorse completely. 

Ubuntu: an African value of the collectivism of communities

Alternative ethics systems such as Ubuntu are not currently included in the exclusive discourse on ethics and AI. The default is Western ethics, which are burdened with individualism and greed, and are not adequate to address technological and algorithmic harms. (Mhlambi,  2020 p. 23) Individualism and greed also stand in opposition to Ubuntu’s foundations of interconnectedness, empathy, and generosity. (Mhlambi,  2020 p. 24) These are the values that AI development would benefit from immensely, which would make individualistic values irrelevant. How can this be implemented for the governance of AI?

Ethical Leadership: Ubuntu promotes cooperation and helping each other

Ethical governance requires a closer look at leadership. Cooperation and participation are requirements for Ubuntu, particularly when it comes to leadership, as it rejects elite concentrations of power. (Mhlambi, 2020, pp. 15-16) The current leadership in AI consists of concentrations of power amongst a few elites, which could get in the way of Ubuntu truly working. One Ubuntu saying, “Inkosi yinkosi ngaba-Ntu”, translates to “A leader derives power from the consent and will of the governed.” (Mhlambi, 2020, pp. 15-16) Government and other powers should be acting in service to the people. This is the purpose of leadership.

However, it is not what we see from most leaders. Following Ubuntu, rulership is collaborative. That is how things should really be done within governance, by being in service to the people. 

How do we make this value-shift happen and balance power structures?

Focusing on Inclusion to combat exclusion

Arthur Gwagwa suggested that there be more of a focus in research and policy work on “Ubuntu-based action guiding principles for all AI stakeholders.” (Gwagwa, 2022 p. 1) He gave an example of providing guidance to reconcile ethical dilemmas in AI design, including conflicting or competing cultural values. (Gwagwa, 2022 p. 1) This would support the notion of inclusivity that Ubuntu ethics would bring to AI design. 

Gwagwa went on to provide a useful definition of exclusion: ‘‘the inability to participate effectively in economic, social, political, and cultural life, and, in some characterizations, alienation and distance from the mainstream society.’’ (Duffy, 1995) (Gwagwa, 2022 p. 2) This is something that is important to keep in mind, also when thinking about digital identity. 

Rationality vs. Relationality

While reading about Ubuntu and AI ethics, I kept encountering the comparison between rationality and relationality in response to the question: how do we define personhood?

Personhood as rationality traditionally comes from a Western viewpoint, which is what has modeled machine intelligence, and “has always been marked by contradictions, exclusions, and inequality.” (Mhlambi, 2020) How do we situate what it means to be a person when contemplating “artificial or mechanical personhood”? (Mhlambi, 2020)

Looking to Western ethics, utilitarianism, which tends to be very rationalizing, doesn’t always play out appropriately. Utilitarianism as applied to AI ethics aims to maximize what is good for people and minimize what is bad for them in the long run. (Shulman et al., 2009) (Gwagwa, 2022, p. 5) This ultimately still leaves some people excluded and disadvantaged, and they continue to be those who are perpetually marginalized.

Taking a bottom-up approach, African philosophy could address both the disproportionate negative effects of AI on people and work towards global equality and protections. (Mhlambi,  2020 p. 6)

Contrasting Collectivism and Individualism

Individualism, something that I have butted heads with in my own research over the years, desperately needs to be challenged, as it has many flaws. Generally, individualism is the idea that the central point of value in society is the self-complete, autonomous individual. (Mhlambi, 2020, p. 7)

Mhlambi lists several flaws of individualism, including:

  1. Justification of inequality

  2. Power asymmetries and exploitation which disallow upward social mobility

  3. Worsening of inequalities due to lack of upward mobility

  4. Increased inequality and the prioritized private interests of those in power cause cycles of political instability (Mhlambi, 2020, pp. 7, 10)

These harms are ultimately produced by any system based on individualistic principles. (Mhlambi,  2020 p. 10) My question is, does individualism really fit in with any ethical system? When will we realize that individualism is unethical?

Ethics beyond the human-centered world

Western ethics at best is people-centered, and ignores any connection between us and the Earth; rather, it allows for exploitation of it. “Mastery of nature” was the Enlightenment’s goal of self-realization, which some say that today has transformed into “the mastery of bits and cyberspace.” (Kennington, 1978) (Mhlambi, 2020 p. 9) These ideals “tolerate the inevitability of inequality.” (Mhlambi, 2020 p. 9) Justification of exploitation is incredibly unethical, and for this ideal to be adopted by AI could cause unimaginable problems, where instead technologies should be used to support and protect humanity and the Earth. 

What is currently valued in AI development?

One of the most highly valued, and problematic, aspects of AI development is speed, which perhaps shouldn’t be the most important thing. In the world of AI, speed can equate to success, and it is said that similarity creates speed. However, like individualism, similarity has many flaws, including:

  1. Decreased diversity

  2. Filter bubbles

  3. May lead to discrimination, e.g., by race or gender (Mhlambi, 2020, p. 20)

This ties in with individualism coming from a monoculture of Silicon Valley, which promotes excessiveness and greedy competition, as self-interest takes center stage. (Murobe, 2000)  (Mhlambi,  2020 p. 9) Theoretically, this goes against Western ethics as well, which would lead us to act in the best interest of all humans and not put ourselves above others. However, this is not how it works in reality, arguably, because of individualism. 

So where do we turn? In the unique balance which is absent from Western individualism as well as Eastern communism, we find African Ubuntu, which “seeks to avoid the worst of extreme systems.” (Mhlambi,  2020 p. 17)

Ubuntu is about human connectedness with other people, living things and the universe at large.

Ubuntu views humanity as how a person relates in meaningful ways in relation with other persons. “A shared humanity, a oneness and indissoluble interconnectedness between all humans, needs to be the paramount human identity and positionality from which we organize our societies, and produce the technological advances that maintain social harmony.” (Mhlambi,  2020 p. 21)

This is not to say that there is no concept of the individual within Ubuntu ideology. Rather, the individual has many important roles to play. These include:

  1. Doing one’s part to maximize public good

  2. Affirming the dignity of all and restoring breaks in harmony 

  3. Creating the necessary environment for all to thrive (Mhlambi,  2020 p. 24)

My conclusions from Mhlambi’s work lead me to reiterate that inclusion cannot be complete as long as inequality exists. (Mhlambi,  2020 p. 24) 

Ubuntu is a philosophy that encourages us to help each other: Can we apply that to building AI?

Technology is not lacking ethics. Societal values are ever-present in the creation and use of technology: but what ethics are included matters. This gives us a clear view of where society’s ethics stand: with those in power. Compassion, equity and relationality are missing in this, and that is a problem. If actions are taken to shift to these crucial values of Ubuntu and collectivism, this change could start with AI and radiate out to benefit everyone as well as the planet. 

“Personhood must be extended to all human beings, informed by the awareness that one’s personhood is directly connected to the personhood of others.” (Mhlambi,  2020 p. 7)

Resources

Duffy, K. (1995). Social Exclusion and Human Dignity in Europe: Background Report for the Proposed Initiative by the Council of Europe (Strasbourg: Council of Europe)

Gwagwa, A. E. (2021). Africa’s contribution to an intercultural reflective turn in the ethics of technology in the era of disruption. https://www.academia.edu/51050494/Africas_contribution_to_an_intercultural_reflective_turn_in_the_ethics_of_te

Gwagwa, A., Kazim, E., & Hilliard, A. (2022). The role of the African value of Ubuntu in global AI inclusion discourse: A normative ethics perspective. In Patterns (Vol. 3, Issue 4). Cell Press. https://doi.org/10.1016/j.patter.2022.100462

Kennington, R. (1978). Descartes and mastery of nature. In S. F. Spicker (Ed.), Organism, Medicine, and Metaphysics (Philosophy and Medicine, Vol. 7). Springer, Dordrecht.

Kropotkin, P. A. (1902). Mutual Aid: A Factor of Evolution. New York: McClure Phillips and Co.

Mhlambi, S. (2020). Ubuntu as an ethical and human rights framework for artificial intelligence governance. Carr Center for Human Rights Policy Discussion Paper, Harvard Kennedy School.

Shulman, C., Jonsson, H., & Tarleton, N. (2009). Which consequentialism? Machine ethics and moral divergence. Asia-Pacific Conference on Computing and Philosophy, 23–25. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.363.2419&rep=rep1&type=pdf

Murobe, M. F. (2000). Globalization and African Renaissance: An ethical reflection. In E. Maloka & E. Le Roux (Eds.), Problematising the African Renaissance (pp. 43–67). Pretoria: Africa Institute of South Africa.

DBSCAN Algorithm Tutorial in Python

Density-based Spatial Clustering of Applications with Noise (DBSCAN)

In my previous article, HCA Algorithm Tutorial, we gave an overview of clustering with a deep focus on the Hierarchical Clustering method, which works best when looking for a hierarchical solution. When we don’t want a hierarchical solution and we don’t want to specify the number of clusters in advance, Density-based Spatial Clustering of Applications with Noise, abbreviated DBSCAN, is a fantastic choice.

This technique performs better with arbitrary-shaped clusters (clusters without a simple geometric shape like a circle or square) and with noise, meaning data points that don’t belong to any cluster. DBSCAN also aids in outlier detection by grouping points that lie near each other, using two parameters termed Eps and minPoints.

Eps, also known as epsilon, is the radius of the circle we generate around every data point, while minPoints is the minimum number of data points that must lie within that radius around a central data point for it to be considered a core point.

All points that are not within another point’s Eps distance and do not have the corresponding number of minPoints within their own Eps distance are called noise or outliers.

The selection of Eps and minPoints must be done with particular care, because a small change in their values can influence the entire clustering process. But how can we know the recommended values for our problem?

We can estimate them from our problem’s dataset. We should pick Eps based on the distances between points in the dataset, and a k-distance graph can assist us. Generally, a smaller value is preferable, since a large value merges more data points into each cluster and some information can be lost.


[Figure: Example of variation of eps values. Image extracted from How to Use DBSCAN Effectively]

It is evident that the most optimal results are achieved when the value of eps lies between 0.17 and 0.25. When eps is smaller than this range, there is an excessive amount of noise or outliers, represented by the green color in the plot. On the other hand, when eps is larger, the clusters become too inclusive; with an eps value of 1, everything collapses into a single cluster.

The value of minPoints depends on the amount of data we have. As a guideline, the minimum value should be the dataset’s dimensionality plus 1, that is, the number of features we’re working with plus 1, and there is no fixed maximum. Therefore, the larger the data collection, the greater the minPoints value we should select.

Now that we understand the concepts, let’s see how this algorithm works. The first step is to classify the data points. The algorithm will visit all data points, but it arbitrarily selects one to start with. If there are at least minPoints data points within the radius Eps around the selected point, it considers these points part of the same cluster.

Then the algorithm will repeat the process with the neighbors just selected, possibly expanding the cluster until there are no more near data points. At that point, it will select another point arbitrarily and start doing the same process.

Some points may not get assigned to any cluster; these are considered noise or outliers and are discarded once we stop iterating through the dataset.

Now that we have knowledge of how this algorithm works, let’s compute it with a simple Tutorial in Python.

Python Tutorial

As always, we’ll use an Anaconda distribution, but if you are not familiar with it, you can install the packages beforehand with pip, or you can run the following code block, uncommenting from the “import sys” line onwards:

Now, we’re going to make some basic imports to use in our program.
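A sketch of these imports, based on what the rest of the tutorial uses (NearestNeighbors is an assumption for the k-distance step below):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN               # the clustering algorithm
from sklearn.neighbors import NearestNeighbors   # used for the k-distance graph
```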

For this tutorial, we’ll use a dataset known from our previous article: a list of mall customers (extracted from Machine Learning A-Z: Download Codes and Datasets - Page - SuperDataScience | Machine Learning | AI | Data Science Career | Analytics | Success). As before, we’re going to use the annual income and spending score values.
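A sketch of the loading step; the file name and column labels below match the common copy of the mall customers dataset and may need adjusting for your copy:

```python
# Load the mall customers dataset and keep the two features of interest.
df = pd.read_csv("Mall_Customers.csv")
X = df[["Annual Income (k$)", "Spending Score (1-100)"]].values
```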

At this point we have our data points in memory, so let’s compute a plot where we can see those points for a clearer view of this example.
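For example:

```python
# Quick look at the raw data points.
plt.scatter(X[:, 0], X[:, 1], s=15)
plt.xlabel("Annual income (k$)")
plt.ylabel("Spending score (1-100)")
plt.title("Mall customers")
plt.show()
```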

As previously mentioned, to compute this algorithm we need two parameters: Eps and minPoints. Although there’s no automatic way to determine the minPoints value, we can compute some functions that will give us our Eps. This is done by computing the k-distances between all points in the dataset; the elbow of the resulting curve gives us an approximation of the Eps value.
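One common way to build the k-distance graph is with scikit-learn’s NearestNeighbors; the sketch below approximates that step, and the choice of k is illustrative:

```python
# k-distance graph: distance from each point to its k-th nearest neighbor,
# sorted ascending; the "elbow" of the curve suggests a reasonable Eps.
k = 4  # illustrative choice
nn = NearestNeighbors(n_neighbors=k).fit(X)
distances, _ = nn.kneighbors(X)
k_distances = np.sort(distances[:, -1])

plt.plot(k_distances)
plt.xlabel("Points sorted by k-distance")
plt.ylabel(f"Distance to {k}-th nearest neighbor")
plt.title("k-distance graph")
plt.show()
```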

When we call the function, we get back a float (a decimal value), which we then have to round.

In this case, we get an Eps value of 5, which we use in the next step together with an estimated minPoints value. Next, we give a label to every cluster, going from -1 (representing the noise or outliers) to 4 (our last visited cluster).

Finally, we make our scatter plot, assigning a color to each label, which finishes the tutorial.
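Putting the clustering and plotting steps together, a sketch might look like this (eps = 5 follows the rounded value above; min_samples is an illustrative choice):

```python
# Fit DBSCAN and color each resulting label, including noise (-1).
dbscan = DBSCAN(eps=5, min_samples=4)
labels = dbscan.fit_predict(X)

for label in np.unique(labels):
    mask = labels == label
    name = "noise" if label == -1 else f"cluster {label}"
    plt.scatter(X[mask, 0], X[mask, 1], s=15, label=name)

plt.xlabel("Annual income (k$)")
plt.ylabel("Spending score (1-100)")
plt.legend()
plt.title("DBSCAN clustering of mall customers")
plt.show()
```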

In conclusion, the DBSCAN algorithm is a powerful and versatile method for clustering data in a variety of applications. It is particularly well-suited for handling data with irregular shapes and varying densities, and is able to identify noise points and outliers in the data. DBSCAN is also relatively easy to implement and does not require prior knowledge of the number of clusters in the data, making it a popular choice for exploratory data analysis.

However, like any clustering algorithm, DBSCAN has some limitations and assumptions that should be considered when applying it to real-world data. For example, it assumes that clusters are dense regions separated by areas of lower density, and may not perform well on data with very different density levels or noise points that are distributed throughout the data. Additionally, it requires careful selection of its parameters such as the neighborhood size and the minimum number of points required to form a cluster, which can affect the clustering results.

Overall, the DBSCAN algorithm is a valuable tool for data clustering and has been applied successfully in a wide range of fields including image processing, text mining, and bioinformatics. By understanding its strengths and limitations, researchers and practitioners can make informed choices about when and how to apply DBSCAN to their own data.

If you’re interested in exploring the code behind the project discussed in this article, I invite you to visit the corresponding repository on GitHub. There, you can find the source code and view it in action. By exploring the repository, you’ll gain a deeper understanding of how the project was built and how you might be able to use it for your own purposes. I’m available to answer any questions you may have, so don’t hesitate to reach out if you need support. You can find the link to the repository in the article’s footer or by visiting my GitHub profile. Happy coding!


Hierarchical Clustering Algorithm Tutorial in Python

When researching a topic or starting to learn about a new subject, a powerful strategy is to check for influential groups and make sure that sources of information agree with each other. When checking for data agreement, it may be possible to employ a clustering method, which is used to group unlabeled, comparable data points based on their features. For example, you can use clustering to group documents by topic.

On the other hand, clustering can also be used in market segmentation, social network analysis, medical imaging, and anomaly detection.

There are different types of clustering algorithms, and their application depends on the type of problem or approach we want to implement. For example, if you’re searching for a hierarchical method, which implies a multi-level learning technique that learns at multiple levels of granularity, you may use hierarchical clustering.

Hierarchical clustering is a prominent Machine Learning approach for organizing and classifying data to detect patterns and group items to differentiate one from another.

The hierarchy should display the data in a manner comparable to a tree data structure known as a Dendrogram, and there are two methods for grouping the data: agglomerative and divisive.

Before going deeper, we’re going to explain the importance of the dendrogram in clustering. Not only does it give a better representation of the data grouping, but it also gives us information about the optimal number of clusters to compute for our number of data points.

The agglomerative method is the most common type of hierarchical clustering. It is a “bottom-up” approach in which each object starts in its own cluster, called a leaf, and at each phase of the algorithm the two most comparable clusters are joined into a new, larger cluster, called a node.

It is an iterative method, repeated until all points belong to a single large cluster called root that will contain all the data.

The divisive method, which is the opposite of the agglomerative method, is not often used. This divisive approach is less typically employed in hierarchical clustering since it is more computationally costly, resulting in slower performance.

To make this algorithm possible, using an agglomerative approach, we must complete the following steps:

The first step is to create a proximity matrix by computing the distance between each pair of data points. The Euclidean distance function is commonly used, given by the formula

d(p, q) = √( Σᵢ (qᵢ − pᵢ)² )

where p and q are two data points and the sum runs over each of their features.

With this in mind, which two clusters we select to merge depends on our chosen linkage criteria. These criteria should be chosen based on theoretical concerns from the domain of application. A few commonly used ones are:

  • Min Linkage, also known as single-linkage, is the calculation of distance between the two most comparable components of a cluster, which means the closest data points. It can also be defined as the minimum distance between points. The Min technique has the benefit of being able to handle non-elliptical forms properly. One of its drawbacks is that it is susceptible to noise and outliers.

  • Max Linkage, also known as complete linkage, is based on the two least similar points of a cluster, or in other words the maximum distance between points. This linkage method is less vulnerable to noise and outliers than the MIN technique, but it tends to break large clusters and prefers globular clusters.

  • Centroid Linkage, which calculates the distance between the centroids of each cluster.

  • Average Linkage, defines cluster distance as the average pairwise distance between all pairs of points in the clusters.

When there are no theoretical concerns in our problem, it’s very helpful to use the linkage criterion called Ward Linkage. This method examines cluster variance rather than measuring distances directly, merging the pair of clusters that least increases the total within-cluster sum of squared distances from each cluster’s centroid. It has the advantage of being more resistant to noise and outliers.

Once you’ve calculated the distances using your chosen linkage criterion, you can merge the closest data points, creating a new cluster for each pair. After that, all you have to do is keep iterating until you have a single cluster: generate a new proximity matrix and compute the distances using the same linkage criterion as before.

Tutorial in Python

Now that we’ve covered the basic concepts, we’re ready to start the Python tutorial.

Remember that we’re using the Anaconda distribution, so in case you’re not using it, you’ll need to make the corresponding pip installs if you haven’t used these libraries before.

python -m pip install --upgrade pip

pip install numpy

pip install pandas

pip install matplotlib

pip install scipy

pip install scikit-learn

Or you can uncomment the first code block in our Jupyter Notebook and execute it.

Now that you have installed the packages, we’re going to start coding. First, we need to import some basic libraries:
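A minimal sketch of these imports (the original notebook code is not reproduced here, but something along these lines is assumed):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt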

For this example, we’re using a comma-separated values (CSV) file containing a list of mall customers (extracted from the SuperDataScience “Machine Learning A-Z” course datasets). The file gives each customer’s annual income and spending score, and our goal is to separate the customers into clusters using the HCA algorithm.

Now we assign the data points we want, the annual income and spending score of each customer, to a variable, stored as an array containing only the values.
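Something like the following sketch, assuming the file is named Mall_Customers.csv and that the income and spending-score columns sit at positions 3 and 4 (both are assumptions):

dataset = pd.read_csv('Mall_Customers.csv')      # hypothetical file name
X = dataset.iloc[:, [3, 4]].values               # annual income and spending score only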

The next step is to make some other imports.

Next, we generate a dendrogram using the scipy library. Since our problem doesn’t involve a theoretical approach and we want a simple result, we use Ward linkage.
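A sketch of the dendrogram step, assuming X is the array built above:

import scipy.cluster.hierarchy as sch

plt.figure(figsize=(10, 6))
sch.dendrogram(sch.linkage(X, method='ward'))    # Ward linkage, as discussed above
plt.title('Dendrogram')
plt.xlabel('Customers')
plt.ylabel('Euclidean distance')
plt.show()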

Taking a break from coding, we want to explain a little about the use of the dendrogram. As you read before, the dendrogram gives us the recommended number of clusters to compute for our problem. But how can we know this? It’s as easy as drawing a horizontal line so that it crosses the greatest number of vertical lines, making sure not to hit a horizontal line or ‘branching point.’

When two clusters are merged, the dendrogram joins them in a node; each node has a vertical distance, which can be read as its length on the y-axis. In our problem’s dendrogram, the line through the most considerable distance, meaning the greatest distance of a node, crosses five vertical lines, so the dendrogram recommends five clusters for the problem. With this in mind, we can advance to the next step.

Note: if the parameter “affinity” gives you an error, try changing it to “metric”; the keyword was renamed in newer versions of the sklearn library.

We’re going to compute the Agglomerative Clustering using the Euclidean distance and the same Ward linkage criterion. Notice that we specify that we want 5 clusters for this problem.

Then, we just assign the data points to a corresponding cluster.
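A sketch of the clustering and assignment steps described above (on older scikit-learn versions, replace metric with affinity, as the note above mentions):

from sklearn.cluster import AgglomerativeClustering

hc = AgglomerativeClustering(n_clusters=5, metric='euclidean', linkage='ward')
y_hc = hc.fit_predict(X)        # cluster label (0-4) for every customer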

Finally, we’re going to compute our scatterplot and give each cluster a respective tag, according to their spending score and their annual income.
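A sketch of the plotting step; the cluster tags used here are illustrative assumptions, not labels from the original notebook:

cluster_tags = ['Careful', 'Standard', 'Target', 'Careless', 'Sensible']   # hypothetical names
for cluster, tag in enumerate(cluster_tags):
    plt.scatter(X[y_hc == cluster, 0], X[y_hc == cluster, 1], s=40, label=tag)
plt.title('Clusters of customers')
plt.xlabel('Annual income (k$)')
plt.ylabel('Spending score (1-100)')
plt.legend()
plt.show()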

This produces the final result: a scatter plot showing the five clusters we computed, along with their respective legend.

In Conclusion

Hierarchical clustering is a robust method that works with unlabeled data, which is useful because most data, especially new or original data, does not come pre-labeled and thus requires a significant amount of time to classify/annotate. In this article, you learned the fundamental ideas of clustering, how the algorithm works, and some additional concepts for a better understanding, such as dendrograms, Euclidean distance computation, and linkage criteria.

Despite its benefits, such as not requiring a fixed number of clusters up front, we must note that hierarchical clustering does not perform well with huge data sets, owing to the algorithm’s high space and time complexity. This drawback comes from the need to calculate the pairwise distances between data points, as discussed in the section on linkages, and from the analysis of the dendrogram, which becomes too computationally intensive on huge data sets. Keeping these facts in mind, hierarchical clustering may take too long or require too many computational resources to provide useful results on huge data sets, but it is great for small to standard data sets and particularly useful for an early understanding of unlabeled data.

You can stay up to date with Accel.AI; workshops, research, and social impact initiatives through our website, mailing list, meetup group, Twitter, and Facebook.

Introduction to Ubuntu Collectivism Theory Applied to AI Ethics

Justice, inclusivity and solidarity, can we consider these to be universal values?

 

These are some of the main values in the Sub-Saharan African philosophy of Ubuntu, which clarifies many of the core ethics that we find across cultures, such as the age-old golden rule: do unto others as you would want them to do unto you. In essence, it is seeing one’s humanity in the humanity of others. (Gwagwa, 2022 p. 2) 

 

In order to work in a values-first manner, Ubuntu can be useful for informing artificial intelligence (AI) ethics, with an emphasis on inclusivity which is key for AI principles and guidelines that are universally applied. (Gwagwa, 2022 p. 1) Sub-Saharan Africa has historically been excluded from the benefits of past industrial revolutions, as “... its people and their resources and aspirations have been objectified through slavery, colonialism, imperialism, and neo-colonialism.” (Gwagwa, 2022 p. 2) Could Ubuntu inform AI ethics in an effort to create a more inclusive future?

 

One of the core principles of Ubuntu is: “Umu-Ntu ngumu-Ntu nga ba-Ntu” – A person is a person through other persons. What this means is that how one relates to others is interconnected with one’s personhood and measure of ethics. Under this notion, relationality is emphasized, and the successes and failures of individuals are equally the successes and failures of the community. (Mhlambi,  2020 p. 15-16)

 

The way that ethics is measured in Ubuntu is through how a person relates to others as well as to the environment and all other independent parts. Ubuntu can be described as relational personhood, where relationality means accepting the interconnectedness of others while recognizing their individuality, and generally the connection of people, nature, and the spiritual. (Mhlambi, 2020, p. 13) We could take the classic saying about raising children, that it takes a village, as opposed to the individual family units found in Western cultures, as a practical example of Ubuntu. One would not ignore a misbehaving child; any nearby adult would reprimand them, whereas in Western cultures this would rarely happen. Another example from an Ubuntu proverb states that you would not walk by a house being built without lending a hand. (“Wadlula ngendl’isakhiwa kayibeka qaza” — He passed by a hut being built and did not tie a knot) (Mhlambi, 2020, p. 14)

 

When someone is acting ethically, they are said to “have Ubuntu” or considered “unoBuntu.” Someone acting unethically, by only considering themselves and being distant or unhelpful to others, is thought to not have Ubuntu, or be “akala ubu-Ntu.” If the word Ubuntu is broken down, “Ubu” stands for “a state of being and becoming” and “Ntu” in essence means “the idea of a continuous being or becoming a person oriented towards cosmic unity and creative purpose.” (Mhlambi,  2020 p. 13-14)

 

The question is, what can we learn from Ubuntu when thinking through ethics for AI? This type of relational ethics is important to consider when we think about ethics in AI because of how such powerful technology affects people and the world around us. This brings up a lot of questions. How does AI affect people and the world, and why is it important to have a relational type of ethics for AI? Also, how do values in different parts of the world play a role in relational ethical AI development?

 

AI is shaped by the dominant economic, political, and social inequalities fueled by neocolonial thought and resulting in assaults on human dignity. This can be countered by postcolonial African philosophy when creating AI. (Mhlambi, 2020) Greater inclusion and diversity in global discourse on AI ethics is non-negotiable, and we should be collecting the best tools we can to achieve this. Ubuntu is especially helpful for the inclusion of African voices. (Gwagwa, 2021) (Gwagwa, 2022 p. 5) The importance of collective identity in the struggles of African peoples is stressed by Africanist scholars, (Hall, 2012) (Gwagwa, 2022 p. 5) and this must be considered ongoing as technology affects everyone globally. 

“Postcolonial African philosophy’s relevance to the ethics of artificial intelligence is that, as a response to the traumatic encounter between the African world and European modernity, it puts in clear view modernity’s dependency on marginalization and exposes the weaponization of rationality veiled as moral benevolence.” (Eze, 1997) (Mhlambi, 2020 p. 6) By starting from a point of relationality, things that are ultimately harmful to fellow human beings and the world around us cannot be rationalized.

 

A unanimous consensus was reached at the UN Global Pulse in Ghana and Tunisia (Pizzi & Romanoff, 2020), which was that the mistakes of the Global North regarding the development of technologies could be a lesson for Africa to learn from and not repeat: first, formulate a set of values to guide technology, as opposed to thinking of values as an afterthought. “Africans advocated for the need for human control of technology and the promotion of human values, something which has been reactionary rather than proactive in global principles.” (Fjeld & Nagy, 2020) (Gwagwa, 2022 p. 4)

By linking one person’s personhood to the personhood of others, Ubuntu reconciles ethical limitations of rationality as personhood. One cannot be rational when one is only considering oneself. “Rationality is not an individual product or endeavor of a consistent formal system but is a result of interconnected consistent formal systems. Rationality is thus a product of relationality.” (Mhlambi, 2020 p. 3)

Can computers understand relationality? Computers have difficulty with social contexts, particularly racial and gender norms, and automated systems with access to all of this data end up perpetuating racism and gender stereotypes because the data cannot interpret itself, never mind be informative on how to respond to and avoid moral dilemmas. (Mhlambi, 2020 p. 4)

Automated decision-making systems (ADMS) attract five general critiques, each describing a way in which they directly violate Ubuntu. As listed by Mhlambi (2020, p. 8), these critiques include:

1) the exclusion of marginalized communities and their interests in the design, development, decision making, and funding of ADMS

2) biases resulting in the selection of features in ADMS and biases entrenched in the data that generate these systems

3) power asymmetries worsened by the use of ADMS

4) dehumanization that occurs from the commodification of our digital selves

5) the centralization of the resources and power necessary in designing and using ADMS. (Mhlambi,  2020 p. 8)

Solutions would start by correcting these violations at a fundamental level, and at all points throughout AI, machine learning and ADMS development, production, use and application. 

Here is a list of suggestions from Sabelo Mhlambi that would include the values of Ubuntu going forward:

1) Address the climate harms of the cloud computing that much of ADMS relies on. (Greenpeace, 2010)

2) Normalize the eradication of inequality through the participation of the most disenfranchised at the start of creating technology.

3) Use data which powers ADMS for public good.

4) Make data publicly available whilst protecting privacy and promoting societal wellbeing.

5) Treat community data as intellectual property, with the ability to be licensed or revoked from online platforms. 

6) Fund and provide access to technical skill sets for the most disenfranchised. 

7) Allow users to directly shape the way they receive recommendations from algorithms. 

8) Tailor technology companies’ recommendations according to agreed upon social ideals which are based on human dignity and social cohesion. (Mhlambi,  2020 p. 25)

Ubuntu is just one representation of non-Western ethics that decentralizes individualism and recenters the importance of relationality and inclusion. It is sometimes difficult to understand when we have been so overexposed to individualism and the rationality that comes from putting the individual above all else. However, by looking for ethical systems outside of the Western world, perhaps the development of technology that affects everyone could benefit more than just the few, and break cycles of colonialism for good. 

Resources

Eze, Emmanuel Chukwudi. Postcolonial African Philosophy: A Critical Reader. Cambridge, Mass.: Blackwell, 1997.

Fjeld, J., and Nagy, A. (2020). Principled Artificial Intelligence: Mapping consensus in ethical and rights-based approaches to principles for AI. https://cyber.harvard.edu/publication/2020/principled-ai.

Greenpeace. “Make IT Green: Cloud Computing and Its Contribution to Climate Change.” 2010.

Gwagwa, A.E. (2021). Africa’s contribution to an intercultural reflective turn in the ethics of technology in the era of disruption. https://www.academia.edu/51050494/Africas_contribution_to_an_intercultural_reflective_turn_in_the_ethics_of_te

Gwagwa, A., Kazim, E., & Hilliard, A. (2022). The role of the African value of Ubuntu in global AI inclusion discourse: A normative ethics perspective. In Patterns (Vol. 3, Issue 4). Cell Press. https://doi.org/10.1016/j.patter.2022.100462

Mhlambi, S. (2020). From Rationality to Relationality: Ubuntu as an Ethical and Human Rights Framework for Artificial Intelligence Governance. Carr Center Discussion Paper, Technology and Human Rights Fellowship, Carr Center for Human Rights Policy, Harvard Kennedy School.

Pizzi, M., and Romanoff, M. (2020). Governance of AI in Global Pulse’s policy work: zooming in on human rights and ethical frameworks. https://www.unglobalpulse.org/2020/12/governance-of-ai-in-global-pulses-policywork-zooming-in-on-human-rights-and-ethical-frameworks/.


Introduction to Computer Vision: Image segmentation with Scikit-image

Computer Vision is an interdisciplinary field in Artificial Intelligence that enables machines to derive and analyze information from imagery (images and videos) and other forms of visual input. Computer Vision imitates the human eye and is used to train models to perform various functions with the help of cameras, algorithms, and data rather than optic nerves and the visual cortex. Computer Vision has very significant real-world applications, including facial recognition, self-driving cars, and predictive analysis. With self-driving cars, Computer Vision (CV) allows the car’s computer to make sense of the visual input from the car’s cameras and other sensors. In various industries, CV is used for many tasks, such as x-ray analysis in healthcare, quality control in manufacturing, and predictive maintenance in construction, just to name a few.

Outside of just recognition, other methods of analysis include

  • Video motion analysis which uses computer vision to estimate the velocity of objects in a video, or the camera itself.

  • In image segmentation, algorithms partition images into multiple sets of views which we will discuss later in this article.

  • Scene reconstruction which creates a 3D model of a scene inputted through images or video and is popular on social media.

  • In image restoration, noise such as blurring is removed from photos using Machine Learning based filters.

Scikit-Image

A great tool is Scikit-image which is a Python package dedicated to image processing. We’ll be using this tool throughout the article so to follow along you can use the code below to install it:

pip install scikit-image

# For Conda-based distributions

conda install -c conda-forge scikit-image

Basics for Scikit-image

Before getting into image segmentation, we will familiarize ourselves with the scikit-image ecosystem and how it handles images.

Importing images from skimage library

The skimage data module contains some built-in example data sets which are generally stored in jpeg or png format. We will use matplotlib to plot images which is an amazing visualization library in Python for 2D plots of arrays. You can find the link to our notebook here.

  • Importing a grayscale image

  • Importing a colored image

  • Importing images from an external source

Various factors, such as color, format, and even size, affect which methods are used to process an image; higher-contrast images may need more advanced tools.

  • Loading multiple images

A ValueError will be raised if images in the ImageCollection don’t have identical shapes.

  • Saving images
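A minimal sketch of the loading and saving steps listed above, assuming the built-in camera and astronaut sample images; the external file names are hypothetical:

from skimage import data, io
import matplotlib.pyplot as plt

gray_img = data.camera()                      # built-in grayscale example
color_img = data.astronaut()                  # built-in colored example

# external_img = io.imread('my_photo.jpg')             # importing an image from an external source
# collection = io.imread_collection('images/*.png')    # loading multiple images at once

io.imsave('camera_copy.png', gray_img)        # saving an image back to disk

plt.imshow(color_img)
plt.show()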

Converting image format

RGB color model is an additive color model in which the red, green, and blue primary colors of light are added together at different intensities to reproduce a broad array of colors. RGB is the most common color model used today. Every television or computer monitor uses the RGB color model to display images.

  • RGB to Grayscale

So as to apply filters and other processing techniques, the expected input is a two-dimensional array, i.e. a monochrome image. This works well for basic segmentation, which would not behave properly on a high-contrast color image. The rgb2gray function of the skimage package is used to convert a 3-channel RGB image to a one-channel monochrome image.

Output:

  • RGB to HSV

An HSV (Hue, Saturation, and Value) color model is a color model designed to more closely resemble how human vision perceives color. HSV is great for editing because it separates the lightness variations from the hue and saturation variations. The rgb2hsv() function is used to convert an RGB image to HSV format.
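A minimal sketch of both conversions described above, assuming the skimage astronaut sample image:

from skimage import data
from skimage.color import rgb2gray, rgb2hsv
import matplotlib.pyplot as plt

image = data.astronaut()          # any RGB image works here
gray = rgb2gray(image)            # 3-channel RGB -> 1-channel grayscale
hsv = rgb2hsv(image)              # RGB -> Hue, Saturation, Value

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
axes[0].imshow(image); axes[0].set_title('RGB')
axes[1].imshow(gray, cmap='gray'); axes[1].set_title('Grayscale')
axes[2].imshow(hsv[:, :, 0], cmap='hsv'); axes[2].set_title('Hue channel')
plt.show()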

Output:

Image Segmentation

Image Segmentation is the process of splitting an image into multiple layers, represented by an intelligent, pixel-wise mask. Simply put, it is the process of assigning a label to every pixel in an image such that pixels with the same label share certain characteristics (for example, color, intensity, or texture). It involves merging, blocking, and separating regions of the image. Splitting a picture into a collection of image objects with comparable properties is the first stage of image processing. In this article, we will cover image segmentation with thresholding using supervised and unsupervised algorithms.

Thresholding

This is a simple way of segmenting objects in the background by choosing pixels of intensities above or below a certain threshold value. It is a way to create a binary image from a grayscale or full-color image. This is typically done in order to separate “object” or foreground pixels from background pixels to aid in image processing.

Supervised learning

This type of segmentation requires external input that includes things like setting a threshold, converting formats, and correcting external biases.

Segmentation by Thresholding — Manual Input

For this part, an external pixel value ranging from 0 to 255 is used to separate the picture from the background. The intensity value for each pixel is a single value for a gray-level image or three values for a color image. The result is a modified picture split above and below the specified threshold, as we will see below. To implement this thresholding, we first normalize the image from 0-255 to 0-1. A threshold value is fixed, and for each pixel the comparison is made: if it evaluates to true we store the result as 1, otherwise 0.
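A minimal sketch of manual thresholding, assuming the astronaut sample image and a threshold of 0.5 (both are assumptions; tune them for your own image):

import matplotlib.pyplot as plt
from skimage import data
from skimage.color import rgb2gray

image = rgb2gray(data.astronaut())   # rgb2gray already returns values normalized to 0-1
threshold = 0.5
binary = image > threshold           # pixels above the threshold become 1 (True), the rest 0

plt.imshow(binary, cmap='gray')
plt.title('Manually thresholded image')
plt.show()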

Output:

This globally binarized image can be used to detect edges as well as analyze contrast and color differences.

Active Contour Segmentation

An active contour is a segmentation approach that uses energy forces and restrictions to separate the pixels of interest from the remainder of the picture for further processing and analysis. It is among the dynamic approaches in image segmentation, minimizing an energy function resulting from external and internal forces. The external force is specified as curves or surfaces and allows initial outlines to automatically deform into the shapes of objects in the picture, while the internal force is defined by the picture data itself. Active contour segmentation, also called snakes, is initialized using a user-defined contour or line around the area of interest; this contour then slowly contracts and is attracted or repelled by light and edges. The snakes model is popular in computer vision, and snakes are widely used in applications like object tracking, shape recognition, segmentation, edge detection, and stereo matching.

In the example below, after importing the necessary libraries, we convert our image from the scikit-image package to grayscale. Then we plot it and draw a circle around the astronaut’s head to initialize the snake. The active_contour() function fits the snake to image features; a Gaussian filter is also applied to denoise the image. For the parameters alpha and beta, higher values of alpha make the snake contract faster, while higher values of beta make the snake smoother.
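A sketch along the lines of the scikit-image gallery example; the circle center, radius, and the alpha/beta/gamma values are assumptions:

import numpy as np
import matplotlib.pyplot as plt
from skimage import data
from skimage.color import rgb2gray
from skimage.filters import gaussian
from skimage.segmentation import active_contour

img = rgb2gray(data.astronaut())

# Initialize the snake as a circle around the astronaut's head.
s = np.linspace(0, 2 * np.pi, 400)
rows = 100 + 100 * np.sin(s)
cols = 220 + 100 * np.cos(s)
init = np.array([rows, cols]).T

snake = active_contour(gaussian(img, 3), init, alpha=0.015, beta=10, gamma=0.001)

fig, ax = plt.subplots()
ax.imshow(img, cmap='gray')
ax.plot(init[:, 1], init[:, 0], '--r', label='initial contour')
ax.plot(snake[:, 1], snake[:, 0], '-b', label='fitted snake')
ax.legend()
plt.show()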

Output:

Chan-Vese Segmentation

The Chan-Vese segmentation algorithm is designed to segment objects without clearly defined boundaries. The well-known Chan-Vese iterative segmentation method splits a picture into two groups with the lowest intra-class variance. The implementation of this algorithm is only suitable for grayscale images. Some of the parameters used are lambda1 and mu. The typical values for lambda1 and lambda2 are 1. However, if the ‘background’ is very different from the segmented object in terms of distribution then these values should be different from each other, for example, a uniform black image with figures of varying intensity. Typical values for mu are between 0 and 1, though higher values can be used when dealing with shapes with very ill-defined contours. The algorithm then returns a list of values that corresponds to the energy at each iteration. This can be used to adjust the various parameters we have discussed above.

In the example below, we begin by using rgb2gray to convert our image to grayscale. The chan_vese() function is used to segment objects using the Chan-Vese Algorithm whose boundaries are not clearly defined. Then we will plot the output tuple of 3 values which are the original image, the final level image, and one that shows the evolution of energy.
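A minimal sketch, again assuming the astronaut sample image; extended_output=True asks chan_vese() for the tuple of values mentioned above:

import matplotlib.pyplot as plt
from skimage import data
from skimage.color import rgb2gray
from skimage.segmentation import chan_vese

image = rgb2gray(data.astronaut())
segmentation, level_set, energies = chan_vese(image, mu=0.25, lambda1=1, lambda2=1,
                                              extended_output=True)

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
axes[0].imshow(image, cmap='gray');        axes[0].set_title('Original')
axes[1].imshow(segmentation, cmap='gray'); axes[1].set_title('Chan-Vese segmentation')
axes[2].plot(energies);                    axes[2].set_title('Evolution of energy')
plt.show()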

Output:

Unsupervised Learning

This type of image segmentation thresholding algorithm requires no user input. Consider an image that is so large that it is not feasible to consider all pixels simultaneously. So in such cases, Unsupervised segmentation can break down the image into several sub-regions, so instead of millions of pixels, you have tens to hundreds of regions. You may still be able to tweak certain settings to obtain desired outputs.

SLIC (Simple Linear Iterative Clustering)

SLIC algorithm utilizes K-means, a machine learning algorithm, under the hood. It takes in all the pixel values of the image and tries to separate them out into the given number of sub-regions.

SLIC works well with color, so we do not need to convert the image to grayscale. We then set each subregion to the average color of that region, which makes the result look like an image decomposed into areas of similar appearance. label2rgb() replaces each discrete label with the average interior color.
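A minimal sketch, with the number of segments and the compactness chosen as illustrative assumptions:

import matplotlib.pyplot as plt
from skimage import data
from skimage.color import label2rgb
from skimage.segmentation import slic

image = data.astronaut()
segments = slic(image, n_segments=100, compactness=10)    # K-means-based superpixels
averaged = label2rgb(segments, image, kind='avg')         # average color per region

plt.imshow(averaged)
plt.show()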

Output:

Mark Boundaries

This technique produces an image with highlighted borders between labeled areas, where the pictures were segmented using the SLIC method.

In the example below, we segment the image into 100 regions with compactness = 1, and this segmented image acts as a labeled array for the mark_boundaries() function, which returns the image with the boundaries between labeled regions highlighted.
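A sketch of the step described above, assuming the astronaut sample image:

import matplotlib.pyplot as plt
from skimage import data
from skimage.segmentation import slic, mark_boundaries

image = data.astronaut()
segments = slic(image, n_segments=100, compactness=1)

plt.imshow(mark_boundaries(image, segments))
plt.show()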

Output:

Felzenszwalb’s Segmentation

Felzenszwalb’s method uses minimum-spanning-tree clustering as the machine-learning algorithm behind the scenes. Felzenszwalb doesn’t tell us the exact number of clusters that the image will be partitioned into; it runs and generates as many clusters as it thinks are appropriate for the given scale or zoom factor of the image. This may be used to isolate features and identify edges.

In the example below, the seg.felzenszwalb() function is used to compute Felzenszwalb’s efficient graph-based image segmentation. The parameter scale determines the level of observation, and sigma is used to smooth the picture before segmentation. Scale is the sole way to control the quantity of generated segments as well as their size; the size of individual segments within a picture can change drastically depending on local contrast. This is useful for confining individual features, foreground isolation, and noise reduction, and can help you analyze an image more intuitively.
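A sketch with assumed parameter values, which also covers the region count and the label2rgb() recoloring described just below:

import numpy as np
import matplotlib.pyplot as plt
from skimage import data
import skimage.segmentation as seg
from skimage.color import label2rgb

image = data.astronaut()
segments = seg.felzenszwalb(image, scale=100, sigma=0.5, min_size=50)   # assumed parameters

print('Number of unique regions:', np.unique(segments).size)

plt.imshow(label2rgb(segments, image, kind='avg'))
plt.show()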

Output:

We can calculate the number of unique regions the image was partitioned into.

Let’s recolor the image using label2rgb() just like we did with the SLIC algorithm.

It is similar to a posterized image which is essentially just a reduction in the number of colors.

Conclusion

Image segmentation is a vital step in image processing. It is actively researched, with applications ranging from traffic and video surveillance to medical imaging. In this article, we have gone over image segmentation techniques using only the scikit-image module; you could attempt some of these image segmentation methods with libraries like OpenCV. It is worth mentioning, however, that many current image segmentation techniques rely on deep learning.

Training Neural Networks with JAX

JAX is a Python library made to boost machine learning research using accelerators like TPUs/GPUs. Due to its speed and efficiency, coupled with the familiarity of Python and NumPy, it has been widely adopted by machine learning researchers. Besides training neural networks faster, another advantage of JAX is that it saves memory cost and energy. In this tutorial, we’ll be using JAX to create a simple neural network which we’ll use to solve a regression task. If you are new to JAX, this article here is a solid introduction. For our example, we’ll use a small dataset available from Yellowbrick, which is an open-source, pure-Python project that extends scikit-learn with visual analysis and diagnostic tools.

What is a neural network?

Simply put it is a mathematical function that maps a given input in conjunction with information from other nodes to develop an output. It is inspired and modeled on the human mind. In this tutorial, I won’t explain many of the basics of the neural network so if you’re new to neural networks I will refer to this article here.

Regression

Regression is a method of investigating the relationship between independent variables or features and a dependent variable or outcome. It’s used as a method for predictive modeling in machine learning, in which an algorithm is used to predict continuous outcomes. We will create a neural network with JAX to solve a regression task, using the concrete compressive strength dataset from Yellowbrick. Below we will import JAX and some of its submodules, stax and optimizers, which we will use to train neural networks. We have also imported the jax.numpy module, as we’ll require it to convert input data to JAX arrays and for a few other calculations. Here is the link to our notebook.
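A sketch of the imports described above:

import jax
import jax.numpy as jnp
import numpy as np
from jax.example_libraries import stax, optimizers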

Load dataset

We first load the concrete strength dataset available from Yellowbrick. Concrete is the most important material in civil engineering, and predicting the concrete compressive strength is our regression problem; compressive strength is a highly nonlinear function of age and ingredients. We load the data features into variable X and the target values into variable Y. We split the dataset into train (80%) and test (20%) sets according to the Pareto principle, which states that “for many events, roughly 80% of the effects come from 20% of the causes”. After dividing the dataset, we convert each NumPy array to a JAX array using the jax.numpy.array() constructor. We also print the shapes of the train and test sets at the end.
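A minimal sketch of this step; the variable names are my own, and the random seed is an assumption:

from sklearn.model_selection import train_test_split
from yellowbrick.datasets import load_concrete

X, Y = load_concrete()                                   # features and compressive-strength target
X_train, X_test, Y_train, Y_test = train_test_split(
    np.asarray(X), np.asarray(Y), train_size=0.8, random_state=123)

# Convert the NumPy arrays to JAX arrays.
X_train, X_test = jnp.array(X_train), jnp.array(X_test)
Y_train, Y_test = jnp.array(Y_train), jnp.array(Y_test)

print(X_train.shape, X_test.shape, Y_train.shape, Y_test.shape)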

Normalize data

To normalize data, we first calculated the mean and standard deviation of the training dataset for each feature of data. We then subtracted the mean from both training and testing sets. Finally, we divided subtracted values by standard deviation. The main reason to normalize data is to bring the values of each feature to almost the same scale. This helps the optimization algorithm gradient descent to converge faster. When values of different features are on a different scale and vary a lot then it can increase training time because the gradient descent algorithm will have a hard time converging.
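A sketch of the normalization, using statistics computed from the training set only:

mean = X_train.mean(axis=0)
std = X_train.std(axis=0)

X_train = (X_train - mean) / std
X_test = (X_test - mean) / std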

Creating the Neural Network

The JAX module, stax provides various readily available layers that we can stack together to create a neural network. The process of creating a neural network using the stax module is almost the same as that of creating a neural network using Sequential() API of keras. The stax module provides a serial() method that accepts a list of layers and activation functions as input and creates a neural network. It applies the layers in the sequence in which they are given as input when performing forward pass-through data. Using the Dense() method we can create fully connected and dense layers. We can also provide a weight initialization and bias initialization function if we don’t want internal initialization performed by JAX after we create the layer using Dense().

Most stax module methods return 2 callable functions as output when executed:

  1. init_fun — This function takes seed for weight initialization of that layer/network and input shape for that layer/network as input. It then returns weights and biases. For a single layer, it returns just weights and biases as output and for the neural network, it’ll return a list of weights and biases.

  2. apply_fun — This function takes weights & biases of layer/network and data as input. It then executes the layer/network on input data using weights. It performs forward pass-through data for the network.

All activation functions are available as basic attributes of the stax module and we don’t need to call them with brackets. We can just give them as input to the serial() method after layers and they will be applied to the layer’s output.

Below is an example of a Dense() layer with 5 units to show the output returned by it. We can notice that it returns two callable functions which we described above.

We then created our neural network whose layer sizes are [5,10,15,1] the last layer is the output layer and all other layers are hidden layers. We have then created layers using the Dense() method followed by Relu (Rectified Linear Unit) activation function. We’ll provide all the layers and activation to the serial() method in sequence. The Relu function that we have used takes as input an array and returns a new array of the same size where all values less than 0 are replaced by 0.

By calling the init_fun() function we initialize the weights of our neural network, giving it a seed (jax.random.PRNGKey(123)) and the input data shape. The seed and shape information are used to initialize the weights and biases of each layer of the neural network.

Below I have printed the weights and biases for each layer after initializing the weights.

We can perform a forward pass through our neural network. For this, we will take a few samples of our data and give them as input to the apply_fun() function along with weights. First, the weights are given followed by a small batch of data then apply_func() will perform one forward pass-through data using weights and return predictions.
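A sketch of the network creation, weight initialization, and forward pass described above; the names net_init, net_apply, and params are my own, not the notebook’s:

net_init, net_apply = stax.serial(
    stax.Dense(5),  stax.Relu,
    stax.Dense(10), stax.Relu,
    stax.Dense(15), stax.Relu,
    stax.Dense(1),                      # output layer
)

rng = jax.random.PRNGKey(123)
out_shape, params = net_init(rng, (-1, X_train.shape[1]))   # -1 stands for the batch dimension

preds = net_apply(params, X_train[:5])   # forward pass on a small batch of samples
print(preds)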

Define loss function

In this part, we will calculate the gradient of the loss function with respect to the weights and then update the weights using those gradients. We’ll use mean squared error (MSE) as our loss function: it subtracts the predictions from the actual values, squares the differences, and then takes their mean. Our loss function takes weights, data, and actual target values as input. It performs a forward pass through the neural network using the apply_fun() function, providing the weights and data to it, stores the predictions made by the network in a variable, and then calculates the MSE from the actual target values and the predictions.
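A sketch of such a loss function, reusing the net_apply name from the previous sketch:

def mse_loss(params, inputs, targets):
    preds = net_apply(params, inputs).squeeze()   # forward pass
    return jnp.mean((preds - targets) ** 2)       # mean squared error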

Train Neural Network

We will create a function that we will call to train our neural network. The function takes data features, target values, number of epochs, and optimizer state as input. The Optimizer state is an object created by the optimizer that has our model’s weights.

Our function loops a number of epochs time, each time, it first calculates loss value and gradients using the value_and_grad() function. This function takes another function as input, the MSE loss function in our case. It then returns another callable which when called will return the actual value of the function as well as the gradient of the function with respect to the first parameter which is weights in our case. In this instance, we have given our loss function to the value_and_grad() function as input and then called the returned function by providing weights, data features, and target values. With these three as inputs of our loss function, the call will return MSE value and gradients for weights and biases of each layer of our neural network.

Then we will call an optimizer state update method that takes epoch number, gradients, and current optimizer state that has current weights as inputs. The method returns a new optimizer state which will have weights updated by subtracting gradients from it. We will print MSE at every 100 epochs to keep track of our training progress and finally, we return the last optimizer state (final updated weights).

Before calling that routine, let’s look at the optimizer we initialize for our neural network. The optimizer is an algorithm responsible for finding the minimum value of our loss function. The optimizers module, available from the example_libraries module of jax, provides a list of different optimizers. In our case, we use the sgd() (gradient descent) optimizer, initialized with a learning rate of 0.001.

The optimizer returns three callables necessary for maintaining and updating the weights of the neural network.

  1. init — This function takes weights of a neural network as input and returns the OptimizerState object which is a wrapper for holding and updating weights.

  2. update_fn — This function takes the epoch number, gradients, and optimizer state as input. It then updates the weights present in the optimizer state object by subtracting the learning rate times the gradients from them, and returns a new OptimizerState object with the updated weights.

  3. params_fn — This function takes the OptimizerState object as input and returns the actual weights of the neural network.

Here we will train the neural network with the function we created in the previous cell. After initializing the optimizer with the weights, we call our training routine to actually perform the training, providing the data, target values, number of epochs, and optimizer state (weights). We train the network for 2500 epochs.
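A sketch of the optimizer setup and training loop described above, under the same naming assumptions as the earlier sketches:

opt_init, opt_update, get_params = optimizers.sgd(step_size=0.001)

def train(X, y, epochs, opt_state):
    for epoch in range(epochs):
        params = get_params(opt_state)
        loss, grads = jax.value_and_grad(mse_loss)(params, X, y)   # MSE and its gradients
        opt_state = opt_update(epoch, grads, opt_state)            # weight update step
        if epoch % 100 == 0:
            print(f'epoch {epoch:4d}  MSE: {loss:.3f}')
    return opt_state

opt_state = opt_init(params)
opt_state = train(X_train, Y_train, 2500, opt_state)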

Output:

As we can see from the MSE printed every 100 epochs, the model is getting better at the task.

Make Predictions

In this section, we have made predictions for both train and test datasets. We retrieved weights of the neural network using the params_fn optimizer function. We have then given weights and data features as input to the apply_fn method which will make predictions.
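A sketch of the prediction step, using the optimizer callables introduced above:

final_params = get_params(opt_state)
train_preds = net_apply(final_params, X_train).squeeze()
test_preds = net_apply(final_params, X_test).squeeze()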

Evaluating the Model performance

Here we will evaluate how our model is actually performing. We are going to be calculating the R² score for both our train and test predictions. We are calculating the R² score using the r2_score() method of scikit-learn. The score generally returns the value in the range [0,1] where a value near 1 indicates a good model.
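A sketch of the evaluation step:

from sklearn.metrics import r2_score

print('Train R²:', r2_score(np.asarray(Y_train), np.asarray(train_preds)))
print('Test  R²:', r2_score(np.asarray(Y_test), np.asarray(test_preds)))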

We can notice from the R² score that our model seems to be doing a good job.

Train the Model on Batches of Data

Some datasets are quite large and do not really fit into the main memory of the computer. In cases like this, we only bring a small batch of data into the main memory of the computer to train the model a batch at a time until the whole data is covered. The optimization algorithm used in this instance is referred to as stochastic gradient descent and it works on a small batch of data at a time.

The function we have below takes data features, target values, number of epochs, optimizer state (weights), and batch size (default 32) as input. We perform the training loop for the given number of epochs, calculating the start and end indexes of each batch of data inside every loop. We perform a forward pass, calculate the loss, and update the weights on a single batch of data at a time until the whole data set is covered, and this is repeated for the given number of epochs.

Now we have the function for training our neural network. We will initialize the weights of the neural network using init_fun by giving seed and input shape to it. Next, we initialized our optimizer by calling the sgd() function giving a learning rate (0.001). Then we created the first optimizer state with weights and then called our function from the previous cell to perform training in batches. We will be training the neural network for 500 epochs.
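A sketch of batched training under the same assumptions as before, with 500 epochs and a batch size of 32 as described above:

def train_in_batches(X, y, epochs, opt_state, batch_size=32):
    n = X.shape[0]
    for epoch in range(epochs):
        for start in range(0, n, batch_size):
            end = start + batch_size
            params = get_params(opt_state)
            loss, grads = jax.value_and_grad(mse_loss)(params, X[start:end], y[start:end])
            opt_state = opt_update(epoch, grads, opt_state)
        if epoch % 100 == 0:
            print(f'epoch {epoch:4d}  last-batch MSE: {loss:.3f}')
    return opt_state

rng = jax.random.PRNGKey(123)
_, params = net_init(rng, (-1, X_train.shape[1]))
opt_init, opt_update, get_params = optimizers.sgd(step_size=0.001)
opt_state = opt_init(params)
opt_state = train_in_batches(X_train, Y_train, 500, opt_state)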

Making predictions in batches

Because all the data cannot fit into our main memory we will make predictions in batches. Below is a function that takes weights and data as input and then makes predictions on data in batches.

We will call the function above to make predictions on test and train datasets in batches. We will combine the prediction of the batches.
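A sketch of batched prediction, concatenating the per-batch outputs:

def predict_in_batches(params, X, batch_size=32):
    preds = [net_apply(params, X[i:i + batch_size]) for i in range(0, X.shape[0], batch_size)]
    return jnp.concatenate(preds).squeeze()

train_preds = predict_in_batches(get_params(opt_state), X_train)
test_preds = predict_in_batches(get_params(opt_state), X_test)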

Evaluate model performance

We will calculate our R² score on train and test predictions to see how our network is performing.

Conclusion

You can attempt the example above yourself even with other tasks like classification which will use similar code. With modules like Stax and optimizers from JAX, you use less code which helps with efficiency. There are more libraries and modules from JAX to explore that may improve your machine learning research. As you can see with JAX you can vastly improve the speed of your machine learning research depending on your field of research.

References

https://coderzcolumn.com/tutorials/artificial-intelligence/create-neural-networks-using-high-level-jax-api

https://coderzcolumn.com/tutorials/artificial-intelligence/guide-to-create-simple-neural-networks-using-jax

https://jax.readthedocs.io/en/latest/notebooks/neural_network_with_tfds_data.html

SVD Algorithm Tutorial in Python

Singular Value Decomposition Algorithm



The Singular Value Decomposition is a matrix decomposition approach that aids in matrix reduction by generalizing the eigendecomposition of a square matrix (same number of columns and rows) to any matrix. It will help us to simplify matrix calculations.

If you don’t have a clear concept of eigendecomposition, I invite you to read my previous article about the Principal Component Analysis Algorithm, specifically section 3: “Calculate the Eigendecomposition of the Covariance Matrix” (PCA Algorithm Tutorial in Python. Principal Component Analysis (PCA) | by Anthony Barrios | Accel.AI | Apr, 2022 | Medium). It will be of great help, since SVD is a very similar approach to the PCA algorithm but made in a more general way: PCA assumes a square input matrix, while SVD doesn’t.

In general,  when we work with real-number matrices, the formula of SVD is the following:

M = U Σ Vᵀ

Where M is the m x n matrix we wish to decompose, U is the left singular m x m matrix containing the eigenvectors of MMᵀ, the Greek letter Sigma (Σ) represents a diagonal matrix containing the square roots of the eigenvalues of MMᵀ (equivalently MᵀM) arranged in descending order, and V is the right singular n x n matrix containing the eigenvectors of MᵀM.

For a simple understanding of the function of each matrix, we can say that matrices U and V* cause rotation on the matrix, while the Sigma matrix causes scaling. A singular matrix refers to a matrix whose determinant is zero, indicating it doesn’t have a multiplicative inverse.

Python Tutorial

That’s it! Now, let’s see a basic example of this algorithm using Python. We’ll consider this matrix for our demonstration.


The thing about Python and some libraries is that we can make the whole SVD Algorithm by calling a function. But we can also recreate it to watch the step-to-step process. The first thing we’ll do is import the libraries, we’re using NumPy and SciPy. SciPy is a Python open-source library that is used to solve mathematical, scientific, engineering, and technological issues. It enables users to alter and view data using a variety of high-level Python commands. It is based on the NumPy Python extension. You can follow along in this Jupyter Notebook.


Now we’re going to create some functions that make the corresponding calculations; they’re all commented in case you want to examine them.
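The notebook’s own functions aren’t reproduced here, but a from-scratch sketch of the same idea, for a real matrix with full column rank, could look like this:

import numpy as np

def svd_from_scratch(M):
    """Reduced SVD of a real matrix M via the eigendecomposition of M^T M (a sketch)."""
    # Eigendecomposition of the symmetric matrix M^T M gives V and the squared singular values.
    eigvals, V = np.linalg.eigh(M.T @ M)
    # Sort eigenpairs in descending order.
    order = np.argsort(eigvals)[::-1]
    eigvals, V = eigvals[order], V[:, order]
    sigma = np.sqrt(np.clip(eigvals, 0.0, None))
    # Columns of U follow from u_i = M v_i / sigma_i (nonzero singular values only).
    U = np.column_stack([M @ V[:, i] / s for i, s in enumerate(sigma) if s > 1e-12])
    return U, sigma, V.T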



We create our matrix, we’re calling it “A”.



Now we’ll assign values to our variables by calling the functions we created in the previous steps.



When we print our variables, we’re going to obtain the following output:



And that’s pretty much everything. Now, Python allows us to call the SVD function (imported from the SciPy library), making the calculations pretty simple and with no error margin. 
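For reference, the SciPy call is a one-liner, where A is the matrix defined earlier:

from scipy.linalg import svd

U, S, VT = svd(A)     # S holds the singular values in descending order
print(U)
print(S)
print(VT)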




As we can see, the values are pretty much the same, except for some sign changes. That said, since we usually want to work quickly and efficiently, we encourage you to work with the SciPy function.

Applications

Now that we have the basics, we don’t always want to use the SVD algorithm for a simple decomposition. There are many applications that we can do.

SVD can be used to calculate the pseudoinverse of a matrix. This is an extension of the matrix inverse for square matrices to non-square ones (meaning they have a different number of rows and columns). It’s useful for recovering information from matrices that don’t have an inverse; the pseudoinverse is computed from the SVD of the matrix.
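NumPy exposes this directly; a minimal sketch with a small, hypothetical non-square matrix:

import numpy as np

B = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # 3 x 2, so it has no ordinary inverse
B_pinv = np.linalg.pinv(B)                            # computed internally via the SVD
print(B_pinv)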

But we know that SVD Algorithm is widely used as a Dimensionality Reduction method, specifically in image compressions. Saying this, let’s see a Python example for image compression using the SVD Algorithm.

Image Compression in Python using SVD Algorithm

When we want to compress a file, we’re always looking for the most efficient approach with the lowest amount of unnecessary data. The smaller the image, the less the cost of storage and transmission. The SVD Algorithm will help us by decomposing a given matrix (an image is a matrix with different values representing colors) into three matrices to finally represent the image with a smaller set of values.

That way, image compression will be achieved while preserving the important features that make the original picture.

To start working with the algorithm, we’re going to pick the first k elements of each matrix. In the following example, we’re going to use the SVD Algorithm and show some variations according to the number of elements we’re going to work with. For this demonstration, we’ll use the photo of a kitten.


First, we’re going to import our libraries



At this point, we’re very familiar with NumPy. Let’s add other libraries like Matplotlib, a charting library for Python and a NumPy extension. Matplotlib contains the image module, which supports basic image loading, rescaling, and display actions. Specifically, it contains imread, a function that will help us by reading the image as an array.

Also, we’re importing the pyplot interface, which provides an implicit, MATLAB-like way of plotting.

Finally, we import the os library. It has features for creating and deleting directories (folders), retrieving their contents, updating and identifying the current directory, and so forth.

Next, we'll adjust the size of the graphic, read our picture, and convert it to grayscale color, making it easier to see the distinctions between the several photos we'll be displaying.


When we call the plt.show function, it displays our new photograph in grayscale.



Now, let’s see where the magic begins. We’re going to compute the SVD Algorithm using the function imported in NumPy.




At first, this might be tricky to follow, but what we’re doing here is extracting the diagonal entries of the Sigma matrix and arranging them in descending order. Then, iteratively, we select the first k values of every matrix: the first k columns of U and the first k rows of VT. This lets us see how closely we can approximate the original image.
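A sketch of the rank-k reconstruction idea; gray_cat stands for the grayscale image array loaded earlier, and the k values shown are illustrative assumptions:

import numpy as np
import matplotlib.pyplot as plt

U, S, VT = np.linalg.svd(gray_cat, full_matrices=False)

for k in (5, 20, 100):
    approx = U[:, :k] @ np.diag(S[:k]) @ VT[:k, :]   # keep only the first k singular values
    plt.imshow(approx, cmap='gray')
    plt.title(f'Rank-{k} approximation')
    plt.show()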

When calling the last function, it will show us three different approaches with the given k values.

The first one won’t show us anything clear, but we can guess that the image is showing us a cat because of the pointy ears.



In the next picture, we can clearly see a cat, but obviously, it’s a blurry image, so we can’t appreciate the details the original image has.



The final image uses a greater number of k values, which lets us see a much clearer picture. Sure, it has some details missing, but it’s a good number of values to use for compression. This image is perfectly usable for a future project, maybe a cat album page or a veterinary site.

In conclusion

The Singular Value Decomposition algorithm is a powerful tool for dimensionality reduction. In this article, we made a quick review of some of the math terms that help us understand how this algorithm works and how it can be applied in important fields such as image compression.

In the case of image compression, where it greatly reduces transmission and storage costs, we noticed that a simple use of this algorithm comes down to choosing an adequate value of k: as we increase this number, the reconstructed matrix gets closer and closer to the original image. We also saw that this is a simple algorithm to apply, since it doesn’t carry a prohibitive computational cost.

But image compression is not the only thing this algorithm can work with; it also applies to other data, such as databases themselves or video compression. I encourage you to read further and practice with the algorithm we worked on in the Python tutorial section. Nice coding!



Introduction to the JAX library for ML in Python

JAX (Just After eXecution) is a recent machine learning library used for expressing and composing numerical programs. JAX is able to compile numerical programs for the CPU and even accelerators like GPU and TPU to generate optimized code all while using pure python. JAX works great for machine-learning programs because of the familiarity of Python and NumPy together with hardware acceleration. This is great for the definition and composition of user-wielded function transformations. These transformations include automatic differentiation, automatic batching, end-to-end compilation (via XLA), parallelizing over multiple accelerators, and more. Researchers use it for a wide range of advanced applications, from studying training dynamics of neural networks to developing Machine Learning solutions, to probabilistic programming, to developing accelerated numerical code, and to scientific applications in physics and biology. Various tests have shown that JAX can perform up to 8600% faster when used for basic functions. This is highly valuable for data-heavy application-facing models, or just for getting more machine learning experiments done in a day.

Already understand why you want to use JAX? Jump forward to the code!

Some of its vital features are:

  • Just-in-Time (JIT) compilation.

  • Enabling NumPy code on not only CPUs but GPUs and TPUs as well.

  • Automatic differentiation of both NumPy and native Python code.

  • Automatic vectorization.

  • Expressing and composing transformations of numerical programs.

  • An Advanced (pseudo) random number generation.

  • There are more options for control flow.

JAX’s popularity is rising in the deep-learning industry because of its speed, and it is used increasingly in machine learning programs and to accelerate research. JAX provides a general foundation for high-performance scientific computing, which is useful in many fields, not just deep learning. Even if most of your work is not in Python, if you want to build some sort of hybrid model-based / neural-network system, then it is probably worth it to use JAX going forward. If most of your work is not in Python, or you’re using specialized software for your studies (thermodynamics, semiconductors, etc.), then JAX probably isn’t the tool for you, unless you want to export data from these programs for some sort of custom computational processing. Suppose your area of interest is closer to physics/mathematics and incorporates computational methods (dynamical systems, differential geometry, statistical physics) and most of your work is in e.g. Mathematica. In that case, it’s probably worth sticking with what you’re using, especially if you have a large custom codebase.

Getting started with JAX

You can follow along in this Jupyter Notebook. Here we install JAX easily with pip at the command line:
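Something along these lines (the CPU-only wheel is assumed; see the install guide below for accelerator builds):

pip install --upgrade pip
pip install --upgrade "jax[cpu]"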

This however supports CPU only which is useful for local development. If you want both CPU and GPU support you should first install CUDA and CuDNN if not already installed. Also, make sure to map the jaxlib version with the CUDA version you have.

Here is the JAX installation Github Guide for more installation options and troubleshooting.

We will import both JAX and NumPy into our notebook (link here) for a comparison of different use cases:
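A minimal sketch of the imports:

import jax
import jax.numpy as jnp
import numpy as np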

Why use JAX?

Accelerated Linear Algebra (XLA compiler) — XLA is a domain-specific compiler for linear algebra that has been used extensively by Tensorflow and is one of the factors that make JAX so fast. In order to perform matrix operations as fast as possible, the code is compiled into a set of computation kernels that can be extensively optimized based on the nature of the code.

Examples of such optimizations include:

  • Fusion of operations: Intermediate results are not saved in memory

  • Optimized layout: Optimize the “shape” an array is represented in memory

Just-in-time compilation to speed up functions — Just-in-time compilation is a way of executing code that entails the compilation of the code at run time rather than before the execution. Just-in-time compilation comes with Accelerated Linear Algebra (XLA compiler). If we have a sequence of operations, the @jit decorator comes into play to compile multiple operations together using XLA. In order to use XLA and jit, one can use either the jit() function or @jit decorator.
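A sketch of jit in action, along the lines of the JAX quickstart; the function and array sizes are illustrative assumptions:

import jax
import jax.numpy as jnp

def selu(x, alpha=1.67, lmbda=1.05):
    # A simple elementwise function, used only to compare jitted vs. non-jitted execution.
    return lmbda * jnp.where(x > 0, x, alpha * jnp.exp(x) - alpha)

selu_jit = jax.jit(selu)

x = jax.random.normal(jax.random.PRNGKey(0), (1_000_000,))
selu_jit(x).block_until_ready()               # first call triggers XLA compilation
# In a notebook: %timeit selu(x).block_until_ready()
# In a notebook: %timeit selu_jit(x).block_until_ready()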

Using the timeit command, we can see that the improvement in execution time is quite clear. We use block_until_ready because JAX uses asynchronous execution by default. Although jit is incredibly useful in deep learning, it is not without limitations: one of them is that if you use “if” statements in your function, jit may be unable to represent your function accurately.

Auto differentiation with grad() function

As well as evaluating numerical functions, we also want to transform them. One transformation is automatic differentiation. JAX is able to differentiate through all sorts of python and NumPy functions including loops, branches, recursions, and more. This is very useful in deep learning as backpropagation becomes very easy.

In the example below, we define a simple quadratic function and take its derivative on point 1.0. We will find the derivative manually as well In order to prove that the result it’s correct.
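A sketch of the quadratic example; the coefficients are assumptions, and the manual derivative is included as a check:

from jax import grad

def f(x):
    return 3.0 * x ** 2 + 2.0 * x + 1.0

df = grad(f)
print(df(1.0))   # 8.0, matching the manual derivative f'(x) = 6x + 2 at x = 1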

There is so much more to doing auto differentiation with JAX, if you are interested in its full capabilities, you can find more about it in the official documentation.

Auto-vectorization with vmap

Another transformation in JAX’s API that you might find useful is vmap(), the vectorizing map. It has the familiar semantics of mapping a function along array axes, but instead of keeping the loop on the outside, it pushes the loop down into a function’s primitive operations for better performance. When composed with jit(), it can be just as fast as adding the batch dimensions beforehand. In the example below we will take a function that operates on a single data point and vectorize it so it can accept a batch of these data points (or a vector) of arbitrary size.
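A sketch of vmap applied to a single-point function d(x), as referenced below:

from jax import vmap
import jax.numpy as jnp

def d(x):
    # Operates on a single data point.
    return x ** 2

xs = jnp.arange(5.0)
print(vmap(d)(xs))   # squares the whole batch at once: [0. 1. 4. 9. 16.]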

vmap batches all the values together and passes them through the function so it squares all the values at once. When d(x) is run without vmap, the square of each value is computed one at a time and the result appended to a list. Needless to say, this results in an increase both in speed and memory consumption.

Replicate computation across devices with pmap

pmap is another transformation that enables us to replicate a computation onto multiple cores or devices and execute it in parallel. It automatically distributes the computation across all available devices and handles all the communication between them. You can run jax.devices() to check the available devices.
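A minimal sketch, assuming we want to square one row of data per device (with a single device this still runs, just without any real parallelism):

    import jax
    import jax.numpy as jnp
    from jax import pmap

    print(jax.devices())                       # list the available devices

    n = jax.device_count()
    xs = jnp.arange(n * 3.0).reshape(n, 3)     # one row of data per device

    # Replicate the computation across all devices and run it in parallel.
    result = pmap(lambda x: x ** 2)(xs)
    print(result)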

Notice that DeviceArray is now ShardedDeviceArray; this is the structure that handles the parallel execution. JAX also supports collective communication between devices, for example when we want to perform an operation on values that live on different devices, such as gathering the data from all devices and finding the mean.
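Since the original notebook snippet is not reproduced here, this is only a sketch of what such a collective might look like, using jax.lax.pmean inside a pmap-ed function:

    from functools import partial
    import jax
    import jax.numpy as jnp

    @partial(jax.pmap, axis_name="devices")
    def normalize(x):
        # Gather "x" across all devices, compute the mean, and use it on every device.
        return x / jax.lax.pmean(x, axis_name="devices")

    # One value per device (only interesting with more than one device):
    # values = jnp.arange(float(jax.device_count()))
    # print(normalize(values))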

The function above collects all “x” values from the devices, finds their mean, and returns the result to each device so the parallel computation can continue. Note, however, that this only demonstrates real parallelism if you have more than one device communicating. With pmap, we can define our own computation patterns and exploit our devices in the best possible way.

Control flows

In Python programming, the order in which the program’s code is executed at runtime is called control flow. The control flow of a Python program is regulated by conditional statements, loops, and function calls.

Python has three types of control structures, illustrated in the small sketch after this list:

  • Sequential — whose execution process happens in a sequence.

  • Selection — used for decisions and branching, i.e., if, if-else statements

  • Repetition — used for looping, i.e., repeating a piece of code multiple times.
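Purely for illustration, all three structures in a few lines of plain Python:

    total = 0                    # sequential: statements run one after another
    for i in range(5):           # repetition: the loop body repeats
        if i % 2 == 0:           # selection: branch on a condition
            total += i
    print(total)                 # 6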

Control flow with autodiff

When using grad on your Python functions, you can use regular Python control-flow structures with no problems, as if you were using Autograd (or PyTorch or TF Eager).
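For instance, a branching function of the kind discussed in the jit section below differentiates without issue (this sketch mirrors the style of the example in the JAX documentation):

    from jax import grad

    def f(x):
        # Ordinary Python branching works under grad, because grad
        # traces with concrete values by default.
        if x < 3:
            return 3.0 * x ** 2
        else:
            return -4.0 * x

    print(grad(f)(2.0))   # 12.0  (the x < 3 branch)
    print(grad(f)(4.0))   # -4.0  (the else branch)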

Control flow with jit

Control flow with jit, however, is more complicated, and by default it comes with more constraints.

When jit-compiling a function, we want to compile a version that can be cached and reused for many different argument values. To get a view of your Python code that is valid for many different argument values, JAX traces it on abstract values that represent sets of possible inputs. There are multiple levels of abstraction, and different transformations use different abstraction levels. If we trace on an abstract value, we get a view of the function that can be reused for any concrete value of the corresponding type (e.g., while working on different sets of arrays), which means we save on compile time.
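As a sketch of the kind of function being traced (the exact notebook code is not shown here):

    from jax import jit

    def f(x):
        if x < 3:
            return 3.0 * x ** 2
        else:
            return -4.0 * x

    # Under jit, x is an abstract tracer, so the comparison below cannot be
    # resolved to a concrete True/False during tracing:
    # jit(f)(2.0)   # raises a tracer/concretization error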

The function being traced above isn’t committed to a specific concrete value. In the line with if x < 3, the expression x < 3 evaluates to an abstract boolean. When Python attempts to coerce that to a concrete True or False, we get an error: we don’t know which branch to take and can’t continue tracing. You can relax the traceability constraints by having jit trace on more refined abstract values: using the static_argnums argument to jit, we can specify that certain arguments should be traced on their concrete values.
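A minimal sketch of that fix, using the same branching function f as above (redefined here so the snippet is self-contained); note that jit recompiles for every new static value:

    from jax import jit

    def f(x):
        if x < 3:
            return 3.0 * x ** 2
        else:
            return -4.0 * x

    f_jit = jit(f, static_argnums=(0,))   # treat argument 0 (x) as static/concrete
    print(f_jit(2.0))                     # 12.0
    print(f_jit(4.0))                     # -4.0, triggers a fresh compilation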

Asynchronous dispatch

Essentially, what this means is that control is returned to the Python program even before operations are complete. JAX instead returns a DeviceArray, which is a future: a value that will be produced on an accelerator device but isn’t necessarily available immediately. The future can be passed to other operations before the computation is completed. Thus, JAX allows Python code to run ahead of the accelerator, ensuring that it can enqueue operations for the hardware without having to wait.
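A small sketch of what this looks like in practice (block_until_ready is mostly needed when benchmarking or timing code):

    import jax.numpy as jnp
    from jax import random

    key = random.PRNGKey(0)
    x = random.normal(key, (3000, 3000))

    # Dispatch returns immediately; the matrix multiply may still be running
    # on the accelerator while Python continues.
    y = jnp.dot(x, x)

    # Block explicitly when you actually need the result to be ready.
    y.block_until_ready()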

Pseudo-Random number generator (PRNG)

A random number generator has a state; the next “random” number is a function of the previous number and of that state, and the sequence of random values is finite and eventually repeats. Instead of the typical stateful pseudorandom number generators (PRNGs) used in NumPy and SciPy, JAX’s random functions require the PRNG state to be passed explicitly as the first argument.
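A minimal sketch of the explicit-key style (the shapes and seed are arbitrary):

    from jax import random

    key = random.PRNGKey(42)              # explicit PRNG state ("key")

    # The same key always produces the same values (reproducible by design).
    print(random.normal(key, (3,)))

    # To get fresh randomness, split the key instead of mutating hidden state.
    key, subkey = random.split(key)
    print(random.normal(subkey, (3,)))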

Something to note is that this explicit handling of PRNG state works well with vectorization and with parallel computation across devices.

JAX vs NumPy

  • Accelerator Devices — The differences between NumPy and JAX are clearest in relation to accelerator devices, such as GPUs and TPUs. Classic NumPy’s promotion rules are too willing to promote to 64-bit types, which is problematic for a system designed to run on accelerators. JAX uses floating-point promotion rules that are better suited to modern accelerators and, similar to PyTorch, is less aggressive about promoting floating-point types.

  • Control Behavior — When performing unsafe type casts, JAX’s behavior may be backend-dependent and, in general, may diverge from NumPy’s. NumPy allows control over the result in these scenarios via the casting argument; JAX does not provide any such configuration and instead directly inherits the behavior of XLA’s ConvertElementType.

  • Arrays — JAX’s array update functions, unlike their NumPy counterparts, operate out-of-place: the updated array is returned as a new array and the original array is not modified by the update (see the sketch after this list).

  • Inputs — NumPy is generally happy accepting Python lists or tuples as inputs to its API functions; JAX, however, raises an error. This is deliberate, because passing lists or tuples to traced functions can lead to silent performance degradation that might otherwise be difficult to detect.
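A small sketch of the out-of-place update style and the stricter input handling (the exact error type may vary between JAX versions):

    import numpy as np
    import jax.numpy as jnp

    # NumPy: the in-place update mutates the original array.
    a = np.zeros(3)
    a[0] = 1.0

    # JAX: the update returns a new array; the original is left unchanged.
    b = jnp.zeros(3)
    c = b.at[0].set(1.0)
    print(b)   # [0. 0. 0.]
    print(c)   # [1. 0. 0.]

    # JAX is also strict about inputs: a raw Python list is rejected.
    # jnp.sum([1, 2, 3])           # raises an error
    jnp.sum(jnp.array([1, 2, 3]))  # convert explicitly instead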

Conclusion

I have briefly covered what makes JAX a great library: it promises to make ML programming more intuitive, structured, and clean. There is much more to this library than we have covered here, so go ahead and explore more in-depth uses of JAX. You can learn more from its documentation here.

JAX also provides a whole ecosystem of exciting libraries like:

  • Haiku is a neural network library providing object-oriented programming models.

  • RLax is a library for deep reinforcement learning.

  • Jraph, pronounced “giraffe”, is a library used for Graph Neural Networks (GNNs).

  • Optax provides an easy one-liner interface to utilize gradient-based optimization methods efficiently.

  • Chex is used for testing purposes.

Follow me here for more AI, Machine Learning, and Data Science tutorials to come!

You can stay up to date with Accel.AI; workshops, research, and social impact initiatives through our website, mailing list, meetup group, Twitter, and Facebook.

References

https://jax.readthedocs.io/en/latest/notebooks/quickstart.html

https://theaisummer.com/jax/

https://developer.nvidia.com/gtc/2020/video/s21989

https://www.shakudo.io/blog/a-quick-introduction-to-jax

My First FAccT Conference #FAccT2022

TLDR: Attending the ACM FAccT conference in Seoul, South Korea for the first time this year gave me a bit of hope that there are more of me. By that I mean, more people who are sitting at their computers trying to find answers to impossible questions about how to overturn longstanding systems of power and control, reach a place of fairness, accountability, and transparency, and transfer that to artificial intelligence systems. Much of the work is closely examining the depths of the problem, which is rigidly systemic. It is beautiful to see the work being done, and so many people in one place who have dedicated their lives to the task.

My First Day at FAccT

On the first day, I was early, but not that early. I set out from my hotel room in the exact opposite direction that I needed to for the conference hall, flagged down a taxi, and hoped I was headed the right way. I couldn’t miss the huge Coex building, which stood as modern as can be, across from a mountain dotted with Korean Buddhist temples and a giant statue. I gestured to the driver to stop, almost speaking to him in Khmer, but realizing I was not in Cambodia, where I have been living and working remotely. I was in Korea!

Being a Volunteer Coordinator

I love to volunteer at events. You get more of the back-stage feel and get to be a part of the inner workings of it all. After getting my paper submission rejected for the conference, I was delighted to still be accepted as a volunteer coordinator. I hoped that it would put me in a position to meet some of the other volunteers and organizers, in addition to helping the event run smoothly overall.

I knew that my co-volunteer coordinator was at the reception desk and that she was looking for me. I told her I had short light-colored hair. (I am never quite sure what my hair color is; it had been light pink, but coming out of the shower it could really be closer to blond now, depending on the light.)

She spotted me right away, a young and bright Korean girl with hair lighter than mine, half tied in a black bow. She smiled at me, we exchanged names and pleasantries, and I told her whatever she needed me to do, I was ready.

She took me through the crowds of people checking in, and down the corridor to the volunteer room, which was a quiet escape from the hustle and bustle of the conference. There we sat down and started sending messages to the volunteers who were on the schedule for the day, a task we would do every day of the conference. In the background, the first keynote presenter was on someone’s screen, talking from just the other room. As a hybrid conference, many of the talks could be attended either in person or virtually, or both.

We fell into a great rhythm together, and on a break, we walked down to the mall on a floor below the conference hall. We felt like kids sneaking off to go shopping. She took me into a photo booth, which is quite popular in Korea, and we got some polaroid pictures taken together. It was great fun.

Our photo-booth pictures at FAccT 2022

The People I Met at FAccT2022

The conference was attended by a mashup of folks from disciplines ranging from computer science to law to philosophy to activism. I didn’t meet any other anthropologists, however, I did meet a couple of people who said they were wannabe anthropologists. What they all had in common was a connection to ethics in artificial intelligence, primarily, as the namesake of the conference says, Fairness, Accountability, and Transparency in sociotechnical systems.

Some of the best conversations happened on the bus ride to get COVID tested, as was required for those just arriving in Korea. They took us in small groups in a van to a testing site. On the way, I met a few interesting people, including a Ph.D. student researching the philosophy of ethics in AI at Georgetown and another from Australia studying the legal angle for ethical AI development. One man from Michigan started asking about my research and expressed that he was there with the purpose to try and find solutions to the many outstanding questions of this industry.

In retrospect, I find this an odd goal in an ethical AI space. Seeking solutions is often what gets us into trouble; a lot of terrible things are done in the name of solutions to big problems. Later on, I had a conversation with another guy, who commented on my work in the decolonial space. He said that he didn’t really understand decolonization, and asked: was there some end goal to focusing on this topic? Were we all meant to go back to a sort of primitive time before colonization began?

I now notice a theme in these lines of questioning, which tend to come from educated white men, that is inherently problematic for me. It has to do with needing to know the goal or the solution. I am not saying that there are no solutions to the complexity involved in attempting to act from an ethical place in sociotechnical fields: there just isn’t one solution or one goal. This whole solutions-oriented concept to technical ethics applications needs a re-think.

These solution-minded people need to take a seat for a moment and focus on listening, and keep listening. To specifically answer the decolonization question, I would offer that it is more about making corrections and learning from the past at every turn: not repeating the mistakes of the past, but finding ways to interrupt the systems of oppression that like to categorize and therefore minoritize people, causing further disadvantage to those already struggling in a world made to benefit a select few.

This is extremely challenging. Systems of oppression are deeply ingrained into technology and how it works, and as systems become more autonomous, they are learning to oppress people all on their own.

Thoughts on the Many Sessions at FAccT

I found comfort in my own familiarity with the material being presented. Many of the talks I attended touched on areas I have researched, and many used examples I am familiar with, such as Dr. Joy’s work on facial recognition and the recidivism algorithm gone wrong. The concepts of fairness, accountability, and transparency were certainly thematic, particularly in the CRAFT poster sessions. There were a lot of fresh concepts as well, including things that went a bit over my head and were littered with computer science jargon.

One of the first panels I attended was about intersectionality, which I was excited about because intersectional feminism is something I have always been interested in. In computing and AI, binaries and categories are common and useful. However, people don’t fit so neatly into these distinctions. And it can be harmful to force them into these boxes for the benefit of data mining and the like.

A really good point that was made on the panel was that it is impossible to always include everyone. No matter what you do, some people or groups will be excluded. It has to be decided to what extent this is tolerated on a case-by-case basis. Another good point was that when it comes to intersectionality, issues can have multiplicative effects and are not just additive. So if a person is female, African American, and disabled, that is a huge multiplication of features that stray from the majoritized group of able-bodied white males.

I use the word majoritized and also minoritized, because white males are not actually the majority. One might think so by looking at all of the research that historically supports them and ignores and invisibilizes all others. So-called minorities are not always the minority, especially globally. Research tends to be Western and generally US-focused, not allowing for the reality that there are whole continents of non-white people who get put into a minority category when the technologies being researched are deployed globally. It makes me a bit mad, but it was great to be in a space where people recognize the same issues.

Final Takeaways

Overall, FAccT this year was a space full of shy, geeky, sweet, loving individuals who care about the world and what happens to it, which I just adored! It was sometimes difficult to strike up a conversation with someone, but shells were thin, and once eyes met, smiles could be felt under our masks.

In a world that is far from fair, accountable, or transparent, these individuals stand for bringing these qualities into the light, and that is a beautiful thing. Hopefully, with all these bright young minds focused on the complex issues that stem from AI and tech but that radiate out into the world, this conference and its attendees can serve to create the positive impact we’re intending to.

Sometimes it seems like we are up against a giant in tech, and there are a lot of forces trying to keep things as they are, and even go backward. Progress does not come naturally. Just because technology advances, it does not mean it is up with the times. That takes a lot of work. I hope that it can radiate out of the bubbles we make and the research papers we write.

And I can’t wait for another trip to Korea!

With hope and love, Jennafer Shae Roberts

Exploring the streets of Seoul, South Korea

You can stay up to date with Accel.AI; workshops, research, and social impact initiatives through our website, mailing list, meetup group, Twitter, and Facebook.